# HG changeset patch # User Martin von Zweigbergk # Date 1490369846 25200 # Node ID 4baf79a77afa3f4734e0c51e501c7737a36757eb # Parent aea8ec3f7dd1967a05ecce8f779e16f7ad14fdee# Parent ed5b25874d998ababb181a939dd37a16ea644435 merge with stable diff -r ed5b25874d99 -r 4baf79a77afa Makefile --- a/Makefile Thu Mar 23 19:54:59 2017 -0700 +++ b/Makefile Fri Mar 24 08:37:26 2017 -0700 @@ -163,6 +163,16 @@ --root=build/mercurial/ --prefix=/usr/local/ \ --install-lib=/Library/Python/2.7/site-packages/ make -C doc all install DESTDIR="$(PWD)/build/mercurial/" + # install zsh completions - this location appears to be + # searched by default as of macOS Sierra. + mkdir -p build/mercurial/usr/local/share/zsh/site-functions + cp contrib/zsh_completion build/mercurial/usr/local/share/zsh/site-functions/hg + # install bash completions - there doesn't appear to be a + # place that's searched by default for bash, so we'll follow + # the lead of Apple's git install and just put it in a + # location of our own. + mkdir -p build/mercurial/usr/local/hg/contrib + cp contrib/bash_completion build/mercurial/usr/local/hg/contrib/hg-completion.bash mkdir -p $${OUTPUTDIR:-dist} HGVER=$$((cat build/mercurial/Library/Python/2.7/site-packages/mercurial/__version__.py; echo 'print(version)') | python) && \ OSXVER=$$(sw_vers -productVersion | cut -d. 
-f1,2) && \ @@ -262,5 +272,9 @@ .PHONY: help all local build doc cleanbutpackages clean install install-bin \ install-doc install-home install-home-bin install-home-doc \ dist dist-notests check tests check-code update-pot \ - osx fedora20 docker-fedora20 fedora21 docker-fedora21 \ + osx deb ppa docker-debian-jessie \ + docker-ubuntu-trusty docker-ubuntu-trusty-ppa \ + docker-ubuntu-xenial docker-ubuntu-xenial-ppa \ + docker-ubuntu-yakkety docker-ubuntu-yakkety-ppa \ + fedora20 docker-fedora20 fedora21 docker-fedora21 \ centos5 docker-centos5 centos6 docker-centos6 centos7 docker-centos7 diff -r ed5b25874d99 -r 4baf79a77afa contrib/check-code.py --- a/contrib/check-code.py Thu Mar 23 19:54:59 2017 -0700 +++ b/contrib/check-code.py Fri Mar 24 08:37:26 2017 -0700 @@ -237,7 +237,7 @@ (r'lambda\s*\(.*,.*\)', "tuple parameter unpacking not available in Python 3+"), (r'(?`_ compression library. A C extension -and CFFI interface is provided. +and CFFI interface are provided. -The primary goal of the extension is to provide a Pythonic interface to -the underlying C API. This means exposing most of the features and flexibility +The primary goal of the project is to provide a rich interface to the +underlying C API through a Pythonic interface while not sacrificing +performance. This means exposing most of the features and flexibility of the C API while not sacrificing usability or safety that Python provides. The canonical home for this project is @@ -23,6 +24,9 @@ may be some backwards incompatible changes before 1.0. Though the author does not intend to make any major changes to the Python API. +This project is vendored and distributed with Mercurial 4.1, where it is +used in a production capacity. + There is continuous integration for Python versions 2.6, 2.7, and 3.3+ on Linux x86_x64 and Windows x86 and x86_64. The author is reasonably confident the extension is stable and works as advertised on these @@ -48,14 +52,15 @@ support compression without the framing headers. 
But the author doesn't believe it a high priority at this time. -The CFFI bindings are half-baked and need to be finished. +The CFFI bindings are feature complete and all tests run against both +the C extension and CFFI bindings to ensure behavior parity. Requirements ============ -This extension is designed to run with Python 2.6, 2.7, 3.3, 3.4, and 3.5 -on common platforms (Linux, Windows, and OS X). Only x86_64 is currently -well-tested as an architecture. +This extension is designed to run with Python 2.6, 2.7, 3.3, 3.4, 3.5, and +3.6 on common platforms (Linux, Windows, and OS X). Only x86_64 is +currently well-tested as an architecture. Installing ========== @@ -106,15 +111,11 @@ Comparison to Other Python Bindings =================================== -https://pypi.python.org/pypi/zstd is an alternative Python binding to +https://pypi.python.org/pypi/zstd is an alternate Python binding to Zstandard. At the time this was written, the latest release of that -package (1.0.0.2) had the following significant differences from this package: - -* It only exposes the simple API for compression and decompression operations. - This extension exposes the streaming API, dictionary training, and more. -* It adds a custom framing header to compressed data and there is no way to - disable it. This means that data produced with that module cannot be used by - other Zstandard implementations. +package (1.1.2) only exposed the simple APIs for compression and decompression. +This package exposes much more of the zstd API, including streaming and +dictionary compression. This package also has CFFI support. Bundling of Zstandard Source Code ================================= @@ -260,6 +261,10 @@ compressor's internal state into the output object. This may result in 0 or more ``write()`` calls to the output object. +Both ``write()`` and ``flush()`` return the number of bytes written to the +object's ``write()``. 
In many cases, small inputs do not accumulate enough +data to cause a write and ``write()`` will return ``0``. + If the size of the data being fed to this streaming compressor is known, you can declare it before compression begins:: @@ -476,6 +481,10 @@ the decompressor by calling ``write(data)`` and decompressed output is written to the output object by calling its ``write(data)`` method. +Calls to ``write()`` will return the number of bytes written to the output +object. Not all inputs will result in bytes being written, so return values +of ``0`` are possible. + The size of chunks being ``write()`` to the destination can be specified:: dctx = zstd.ZstdDecompressor() @@ -576,6 +585,53 @@ data = dobj.decompress(compressed_chunk_0) data = dobj.decompress(compressed_chunk_1) +Content-Only Dictionary Chain Decompression +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +``decompress_content_dict_chain(frames)`` performs decompression of a list of +zstd frames produced using chained *content-only* dictionary compression. Such +a list of frames is produced by compressing discrete inputs where each +non-initial input is compressed with a *content-only* dictionary consisting +of the content of the previous input. + +For example, say you have the following inputs:: + + inputs = [b'input 1', b'input 2', b'input 3'] + +The zstd frame chain consists of: + +1. ``b'input 1'`` compressed in standalone/discrete mode +2. ``b'input 2'`` compressed using ``b'input 1'`` as a *content-only* dictionary +3. ``b'input 3'`` compressed using ``b'input 2'`` as a *content-only* dictionary + +Each zstd frame **must** have the content size written. + +The following Python code can be used to produce a *content-only dictionary +chain*:: + + def make_chain(inputs): + frames = [] + + # First frame is compressed in standalone/discrete mode. 
+ zctx = zstd.ZstdCompressor(write_content_size=True) + frames.append(zctx.compress(inputs[0])) + + # Subsequent frames use the previous fulltext as a content-only dictionary + for i, raw in enumerate(inputs[1:]): + dict_data = zstd.ZstdCompressionDict(inputs[i]) + zctx = zstd.ZstdCompressor(write_content_size=True, dict_data=dict_data) + frames.append(zctx.compress(raw)) + + return frames + +``decompress_content_dict_chain()`` returns the uncompressed data of the last +element in the input chain. + +It is possible to implement *content-only dictionary chain* decompression +on top of other Python APIs. However, this function will likely be significantly +faster, especially for long input chains, as it avoids the overhead of +instantiating and passing around intermediate objects between C and Python. + Choosing an API --------------- @@ -634,6 +690,13 @@ dict_data = zstd.ZstdCompressionDict(data) +It is possible to construct a dictionary from *any* data. Unless the +data begins with a magic header, the dictionary will be treated as +*content-only*. *Content-only* dictionaries allow compression operations +that follow to reference raw data within the content. For one use of +*content-only* dictionaries, see +``ZstdDecompressor.decompress_content_dict_chain()``. + More interestingly, instances can be created by *training* on sample data:: dict_data = zstd.train_dictionary(size, samples) @@ -700,19 +763,57 @@ cctx = zstd.ZstdCompressor(compression_params=params) -The members of the ``CompressionParameters`` tuple are as follows:: +The members/attributes of ``CompressionParameters`` instances are as follows:: -* 0 - Window log -* 1 - Chain log -* 2 - Hash log -* 3 - Search log -* 4 - Search length -* 5 - Target length -* 6 - Strategy (one of the ``zstd.STRATEGY_`` constants) +* window_log +* chain_log +* hash_log +* search_log +* search_length +* target_length +* strategy + +This is the order the arguments are passed to the constructor if not using +named arguments. 
You'll need to read the Zstandard documentation for what these parameters do. +Frame Inspection +---------------- + +Data emitted from zstd compression is encapsulated in a *frame*. This frame +begins with a 4 byte *magic number* header followed by 2 to 14 bytes describing +the frame in more detail. For more info, see +https://github.com/facebook/zstd/blob/master/doc/zstd_compression_format.md. + +``zstd.get_frame_parameters(data)`` parses a zstd *frame* header from a bytes +instance and returns a ``FrameParameters`` object describing the frame. + +Depending on which fields are present in the frame and their values, the +length of the frame parameters varies. If insufficient bytes are passed +in to fully parse the frame parameters, ``ZstdError`` is raised. To ensure +frame parameters can be parsed, pass in at least 18 bytes. + +``FrameParameters`` instances have the following attributes: + +content_size + Integer size of original, uncompressed content. This will be ``0`` if the + original content size isn't written to the frame (controlled with the + ``write_content_size`` argument to ``ZstdCompressor``) or if the input + content size was ``0``. + +window_size + Integer size of maximum back-reference distance in compressed data. + +dict_id + Integer of dictionary ID used for compression. ``0`` if no dictionary + ID was used or if the dictionary ID was ``0``. + +has_checksum + Bool indicating whether a 4 byte content checksum is stored at the end + of the frame. 
+ Misc Functionality ------------------ @@ -776,19 +877,32 @@ TARGETLENGTH_MAX Maximum value for compression parameter STRATEGY_FAST - Compression strategory + Compression strategy STRATEGY_DFAST - Compression strategory + Compression strategy STRATEGY_GREEDY - Compression strategory + Compression strategy STRATEGY_LAZY - Compression strategory + Compression strategy STRATEGY_LAZY2 - Compression strategory + Compression strategy STRATEGY_BTLAZY2 - Compression strategory + Compression strategy STRATEGY_BTOPT - Compression strategory + Compression strategy + +Performance Considerations +-------------------------- + +The ``ZstdCompressor`` and ``ZstdDecompressor`` types maintain state to a +persistent compression or decompression *context*. Reusing a ``ZstdCompressor`` +or ``ZstdDecompressor`` instance for multiple operations is faster than +instantiating a new ``ZstdCompressor`` or ``ZstdDecompressor`` for each +operation. The differences are magnified as the size of data decreases. For +example, the difference between *context* reuse and non-reuse for 100,000 +100 byte inputs will be significant (possibly over 10x faster to reuse contexts) +whereas 10 1,000,000 byte inputs will be more similar in speed (because the +time spent doing compression dwarfs time spent creating new *contexts*). 
Note on Zstandard's *Experimental* API ====================================== diff -r ed5b25874d99 -r 4baf79a77afa contrib/python-zstandard/c-ext/compressiondict.c --- a/contrib/python-zstandard/c-ext/compressiondict.c Thu Mar 23 19:54:59 2017 -0700 +++ b/contrib/python-zstandard/c-ext/compressiondict.c Fri Mar 24 08:37:26 2017 -0700 @@ -28,7 +28,8 @@ void* dict; ZstdCompressionDict* result; - if (!PyArg_ParseTupleAndKeywords(args, kwargs, "nO!|O!", kwlist, + if (!PyArg_ParseTupleAndKeywords(args, kwargs, "nO!|O!:train_dictionary", + kwlist, &capacity, &PyList_Type, &samples, (PyObject*)&DictParametersType, &parameters)) { @@ -57,7 +58,6 @@ sampleItem = PyList_GetItem(samples, sampleIndex); if (!PyBytes_Check(sampleItem)) { PyErr_SetString(PyExc_ValueError, "samples must be bytes"); - /* TODO probably need to perform DECREF here */ return NULL; } samplesSize += PyBytes_GET_SIZE(sampleItem); @@ -133,10 +133,11 @@ self->dictSize = 0; #if PY_MAJOR_VERSION >= 3 - if (!PyArg_ParseTuple(args, "y#", &source, &sourceSize)) { + if (!PyArg_ParseTuple(args, "y#:ZstdCompressionDict", #else - if (!PyArg_ParseTuple(args, "s#", &source, &sourceSize)) { + if (!PyArg_ParseTuple(args, "s#:ZstdCompressionDict", #endif + &source, &sourceSize)) { return -1; } diff -r ed5b25874d99 -r 4baf79a77afa contrib/python-zstandard/c-ext/compressionparams.c --- a/contrib/python-zstandard/c-ext/compressionparams.c Thu Mar 23 19:54:59 2017 -0700 +++ b/contrib/python-zstandard/c-ext/compressionparams.c Fri Mar 24 08:37:26 2017 -0700 @@ -25,7 +25,8 @@ ZSTD_compressionParameters params; CompressionParametersObject* result; - if (!PyArg_ParseTuple(args, "i|Kn", &compressionLevel, &sourceSize, &dictSize)) { + if (!PyArg_ParseTuple(args, "i|Kn:get_compression_parameters", + &compressionLevel, &sourceSize, &dictSize)) { return NULL; } @@ -47,12 +48,85 @@ return result; } +static int CompressionParameters_init(CompressionParametersObject* self, PyObject* args, PyObject* kwargs) { + static char* kwlist[] = { + 
"window_log", + "chain_log", + "hash_log", + "search_log", + "search_length", + "target_length", + "strategy", + NULL + }; + + unsigned windowLog; + unsigned chainLog; + unsigned hashLog; + unsigned searchLog; + unsigned searchLength; + unsigned targetLength; + unsigned strategy; + + if (!PyArg_ParseTupleAndKeywords(args, kwargs, "IIIIIII:CompressionParameters", + kwlist, &windowLog, &chainLog, &hashLog, &searchLog, &searchLength, + &targetLength, &strategy)) { + return -1; + } + + if (windowLog < ZSTD_WINDOWLOG_MIN || windowLog > ZSTD_WINDOWLOG_MAX) { + PyErr_SetString(PyExc_ValueError, "invalid window log value"); + return -1; + } + + if (chainLog < ZSTD_CHAINLOG_MIN || chainLog > ZSTD_CHAINLOG_MAX) { + PyErr_SetString(PyExc_ValueError, "invalid chain log value"); + return -1; + } + + if (hashLog < ZSTD_HASHLOG_MIN || hashLog > ZSTD_HASHLOG_MAX) { + PyErr_SetString(PyExc_ValueError, "invalid hash log value"); + return -1; + } + + if (searchLog < ZSTD_SEARCHLOG_MIN || searchLog > ZSTD_SEARCHLOG_MAX) { + PyErr_SetString(PyExc_ValueError, "invalid search log value"); + return -1; + } + + if (searchLength < ZSTD_SEARCHLENGTH_MIN || searchLength > ZSTD_SEARCHLENGTH_MAX) { + PyErr_SetString(PyExc_ValueError, "invalid search length value"); + return -1; + } + + if (targetLength < ZSTD_TARGETLENGTH_MIN || targetLength > ZSTD_TARGETLENGTH_MAX) { + PyErr_SetString(PyExc_ValueError, "invalid target length value"); + return -1; + } + + if (strategy < ZSTD_fast || strategy > ZSTD_btopt) { + PyErr_SetString(PyExc_ValueError, "invalid strategy value"); + return -1; + } + + self->windowLog = windowLog; + self->chainLog = chainLog; + self->hashLog = hashLog; + self->searchLog = searchLog; + self->searchLength = searchLength; + self->targetLength = targetLength; + self->strategy = strategy; + + return 0; +} + PyObject* estimate_compression_context_size(PyObject* self, PyObject* args) { CompressionParametersObject* params; ZSTD_compressionParameters zparams; PyObject* result; - if 
(!PyArg_ParseTuple(args, "O!", &CompressionParametersType, &params)) { + if (!PyArg_ParseTuple(args, "O!:estimate_compression_context_size", + &CompressionParametersType, &params)) { return NULL; } @@ -64,113 +138,33 @@ PyDoc_STRVAR(CompressionParameters__doc__, "CompressionParameters: low-level control over zstd compression"); -static PyObject* CompressionParameters_new(PyTypeObject* subtype, PyObject* args, PyObject* kwargs) { - CompressionParametersObject* self; - unsigned windowLog; - unsigned chainLog; - unsigned hashLog; - unsigned searchLog; - unsigned searchLength; - unsigned targetLength; - unsigned strategy; - - if (!PyArg_ParseTuple(args, "IIIIIII", &windowLog, &chainLog, &hashLog, &searchLog, - &searchLength, &targetLength, &strategy)) { - return NULL; - } - - if (windowLog < ZSTD_WINDOWLOG_MIN || windowLog > ZSTD_WINDOWLOG_MAX) { - PyErr_SetString(PyExc_ValueError, "invalid window log value"); - return NULL; - } - - if (chainLog < ZSTD_CHAINLOG_MIN || chainLog > ZSTD_CHAINLOG_MAX) { - PyErr_SetString(PyExc_ValueError, "invalid chain log value"); - return NULL; - } - - if (hashLog < ZSTD_HASHLOG_MIN || hashLog > ZSTD_HASHLOG_MAX) { - PyErr_SetString(PyExc_ValueError, "invalid hash log value"); - return NULL; - } - - if (searchLog < ZSTD_SEARCHLOG_MIN || searchLog > ZSTD_SEARCHLOG_MAX) { - PyErr_SetString(PyExc_ValueError, "invalid search log value"); - return NULL; - } - - if (searchLength < ZSTD_SEARCHLENGTH_MIN || searchLength > ZSTD_SEARCHLENGTH_MAX) { - PyErr_SetString(PyExc_ValueError, "invalid search length value"); - return NULL; - } - - if (targetLength < ZSTD_TARGETLENGTH_MIN || targetLength > ZSTD_TARGETLENGTH_MAX) { - PyErr_SetString(PyExc_ValueError, "invalid target length value"); - return NULL; - } - - if (strategy < ZSTD_fast || strategy > ZSTD_btopt) { - PyErr_SetString(PyExc_ValueError, "invalid strategy value"); - return NULL; - } - - self = (CompressionParametersObject*)subtype->tp_alloc(subtype, 1); - if (!self) { - return NULL; - } - - 
self->windowLog = windowLog; - self->chainLog = chainLog; - self->hashLog = hashLog; - self->searchLog = searchLog; - self->searchLength = searchLength; - self->targetLength = targetLength; - self->strategy = strategy; - - return (PyObject*)self; -} - static void CompressionParameters_dealloc(PyObject* self) { PyObject_Del(self); } -static Py_ssize_t CompressionParameters_length(PyObject* self) { - return 7; -} - -static PyObject* CompressionParameters_item(PyObject* o, Py_ssize_t i) { - CompressionParametersObject* self = (CompressionParametersObject*)o; - - switch (i) { - case 0: - return PyLong_FromLong(self->windowLog); - case 1: - return PyLong_FromLong(self->chainLog); - case 2: - return PyLong_FromLong(self->hashLog); - case 3: - return PyLong_FromLong(self->searchLog); - case 4: - return PyLong_FromLong(self->searchLength); - case 5: - return PyLong_FromLong(self->targetLength); - case 6: - return PyLong_FromLong(self->strategy); - default: - PyErr_SetString(PyExc_IndexError, "index out of range"); - return NULL; - } -} - -static PySequenceMethods CompressionParameters_sq = { - CompressionParameters_length, /* sq_length */ - 0, /* sq_concat */ - 0, /* sq_repeat */ - CompressionParameters_item, /* sq_item */ - 0, /* sq_ass_item */ - 0, /* sq_contains */ - 0, /* sq_inplace_concat */ - 0 /* sq_inplace_repeat */ +static PyMemberDef CompressionParameters_members[] = { + { "window_log", T_UINT, + offsetof(CompressionParametersObject, windowLog), READONLY, + "window log" }, + { "chain_log", T_UINT, + offsetof(CompressionParametersObject, chainLog), READONLY, + "chain log" }, + { "hash_log", T_UINT, + offsetof(CompressionParametersObject, hashLog), READONLY, + "hash log" }, + { "search_log", T_UINT, + offsetof(CompressionParametersObject, searchLog), READONLY, + "search log" }, + { "search_length", T_UINT, + offsetof(CompressionParametersObject, searchLength), READONLY, + "search length" }, + { "target_length", T_UINT, + offsetof(CompressionParametersObject, 
targetLength), READONLY, + "target length" }, + { "strategy", T_INT, + offsetof(CompressionParametersObject, strategy), READONLY, + "strategy" }, + { NULL } }; PyTypeObject CompressionParametersType = { @@ -185,7 +179,7 @@ 0, /* tp_compare */ 0, /* tp_repr */ 0, /* tp_as_number */ - &CompressionParameters_sq, /* tp_as_sequence */ + 0, /* tp_as_sequence */ 0, /* tp_as_mapping */ 0, /* tp_hash */ 0, /* tp_call */ @@ -193,7 +187,7 @@ 0, /* tp_getattro */ 0, /* tp_setattro */ 0, /* tp_as_buffer */ - Py_TPFLAGS_DEFAULT, /* tp_flags */ + Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE, /* tp_flags */ CompressionParameters__doc__, /* tp_doc */ 0, /* tp_traverse */ 0, /* tp_clear */ @@ -202,16 +196,16 @@ 0, /* tp_iter */ 0, /* tp_iternext */ 0, /* tp_methods */ - 0, /* tp_members */ + CompressionParameters_members, /* tp_members */ 0, /* tp_getset */ 0, /* tp_base */ 0, /* tp_dict */ 0, /* tp_descr_get */ 0, /* tp_descr_set */ 0, /* tp_dictoffset */ - 0, /* tp_init */ + (initproc)CompressionParameters_init, /* tp_init */ 0, /* tp_alloc */ - CompressionParameters_new, /* tp_new */ + PyType_GenericNew, /* tp_new */ }; void compressionparams_module_init(PyObject* mod) { diff -r ed5b25874d99 -r 4baf79a77afa contrib/python-zstandard/c-ext/compressionwriter.c --- a/contrib/python-zstandard/c-ext/compressionwriter.c Thu Mar 23 19:54:59 2017 -0700 +++ b/contrib/python-zstandard/c-ext/compressionwriter.c Fri Mar 24 08:37:26 2017 -0700 @@ -52,7 +52,7 @@ ZSTD_outBuffer output; PyObject* res; - if (!PyArg_ParseTuple(args, "OOO", &exc_type, &exc_value, &exc_tb)) { + if (!PyArg_ParseTuple(args, "OOO:__exit__", &exc_type, &exc_value, &exc_tb)) { return NULL; } @@ -119,11 +119,12 @@ ZSTD_inBuffer input; ZSTD_outBuffer output; PyObject* res; + Py_ssize_t totalWrite = 0; #if PY_MAJOR_VERSION >= 3 - if (!PyArg_ParseTuple(args, "y#", &source, &sourceSize)) { + if (!PyArg_ParseTuple(args, "y#:write", &source, &sourceSize)) { #else - if (!PyArg_ParseTuple(args, "s#", &source, &sourceSize)) { + if 
(!PyArg_ParseTuple(args, "s#:write", &source, &sourceSize)) { #endif return NULL; } @@ -164,20 +165,21 @@ #endif output.dst, output.pos); Py_XDECREF(res); + totalWrite += output.pos; } output.pos = 0; } PyMem_Free(output.dst); - /* TODO return bytes written */ - Py_RETURN_NONE; + return PyLong_FromSsize_t(totalWrite); } static PyObject* ZstdCompressionWriter_flush(ZstdCompressionWriter* self, PyObject* args) { size_t zresult; ZSTD_outBuffer output; PyObject* res; + Py_ssize_t totalWrite = 0; if (!self->entered) { PyErr_SetString(ZstdError, "flush must be called from an active context manager"); @@ -215,14 +217,14 @@ #endif output.dst, output.pos); Py_XDECREF(res); + totalWrite += output.pos; } output.pos = 0; } PyMem_Free(output.dst); - /* TODO return bytes written */ - Py_RETURN_NONE; + return PyLong_FromSsize_t(totalWrite); } static PyMethodDef ZstdCompressionWriter_methods[] = { diff -r ed5b25874d99 -r 4baf79a77afa contrib/python-zstandard/c-ext/compressobj.c --- a/contrib/python-zstandard/c-ext/compressobj.c Thu Mar 23 19:54:59 2017 -0700 +++ b/contrib/python-zstandard/c-ext/compressobj.c Fri Mar 24 08:37:26 2017 -0700 @@ -42,9 +42,9 @@ } #if PY_MAJOR_VERSION >= 3 - if (!PyArg_ParseTuple(args, "y#", &source, &sourceSize)) { + if (!PyArg_ParseTuple(args, "y#:compress", &source, &sourceSize)) { #else - if (!PyArg_ParseTuple(args, "s#", &source, &sourceSize)) { + if (!PyArg_ParseTuple(args, "s#:compress", &source, &sourceSize)) { #endif return NULL; } @@ -98,7 +98,7 @@ PyObject* result = NULL; Py_ssize_t resultSize = 0; - if (!PyArg_ParseTuple(args, "|i", &flushMode)) { + if (!PyArg_ParseTuple(args, "|i:flush", &flushMode)) { return NULL; } diff -r ed5b25874d99 -r 4baf79a77afa contrib/python-zstandard/c-ext/compressor.c --- a/contrib/python-zstandard/c-ext/compressor.c Thu Mar 23 19:54:59 2017 -0700 +++ b/contrib/python-zstandard/c-ext/compressor.c Fri Mar 24 08:37:26 2017 -0700 @@ -16,7 +16,7 @@ Py_BEGIN_ALLOW_THREADS memset(&zmem, 0, sizeof(zmem)); 
compressor->cdict = ZSTD_createCDict_advanced(compressor->dict->dictData, - compressor->dict->dictSize, *zparams, zmem); + compressor->dict->dictSize, 1, *zparams, zmem); Py_END_ALLOW_THREADS if (!compressor->cdict) { @@ -128,8 +128,8 @@ self->cparams = NULL; self->cdict = NULL; - if (!PyArg_ParseTupleAndKeywords(args, kwargs, "|iO!O!OOO", kwlist, - &level, &ZstdCompressionDictType, &dict, + if (!PyArg_ParseTupleAndKeywords(args, kwargs, "|iO!O!OOO:ZstdCompressor", + kwlist, &level, &ZstdCompressionDictType, &dict, &CompressionParametersType, &params, &writeChecksum, &writeContentSize, &writeDictID)) { return -1; @@ -243,8 +243,8 @@ PyObject* totalReadPy; PyObject* totalWritePy; - if (!PyArg_ParseTupleAndKeywords(args, kwargs, "OO|nkk", kwlist, &source, &dest, &sourceSize, - &inSize, &outSize)) { + if (!PyArg_ParseTupleAndKeywords(args, kwargs, "OO|nkk:copy_stream", kwlist, + &source, &dest, &sourceSize, &inSize, &outSize)) { return NULL; } @@ -402,9 +402,9 @@ ZSTD_parameters zparams; #if PY_MAJOR_VERSION >= 3 - if (!PyArg_ParseTupleAndKeywords(args, kwargs, "y#|O", + if (!PyArg_ParseTupleAndKeywords(args, kwargs, "y#|O:compress", #else - if (!PyArg_ParseTupleAndKeywords(args, kwargs, "s#|O", + if (!PyArg_ParseTupleAndKeywords(args, kwargs, "s#|O:compress", #endif kwlist, &source, &sourceSize, &allowEmpty)) { return NULL; @@ -512,7 +512,7 @@ return NULL; } - if (!PyArg_ParseTupleAndKeywords(args, kwargs, "|n", kwlist, &inSize)) { + if (!PyArg_ParseTupleAndKeywords(args, kwargs, "|n:compressobj", kwlist, &inSize)) { return NULL; } @@ -574,8 +574,8 @@ size_t outSize = ZSTD_CStreamOutSize(); ZstdCompressorIterator* result; - if (!PyArg_ParseTupleAndKeywords(args, kwargs, "O|nkk", kwlist, &reader, &sourceSize, - &inSize, &outSize)) { + if (!PyArg_ParseTupleAndKeywords(args, kwargs, "O|nkk:read_from", kwlist, + &reader, &sourceSize, &inSize, &outSize)) { return NULL; } @@ -693,8 +693,8 @@ Py_ssize_t sourceSize = 0; size_t outSize = ZSTD_CStreamOutSize(); - if 
(!PyArg_ParseTupleAndKeywords(args, kwargs, "O|nk", kwlist, &writer, &sourceSize, - &outSize)) { + if (!PyArg_ParseTupleAndKeywords(args, kwargs, "O|nk:write_to", kwlist, + &writer, &sourceSize, &outSize)) { return NULL; } diff -r ed5b25874d99 -r 4baf79a77afa contrib/python-zstandard/c-ext/decompressionwriter.c --- a/contrib/python-zstandard/c-ext/decompressionwriter.c Thu Mar 23 19:54:59 2017 -0700 +++ b/contrib/python-zstandard/c-ext/decompressionwriter.c Fri Mar 24 08:37:26 2017 -0700 @@ -71,11 +71,12 @@ ZSTD_inBuffer input; ZSTD_outBuffer output; PyObject* res; + Py_ssize_t totalWrite = 0; #if PY_MAJOR_VERSION >= 3 - if (!PyArg_ParseTuple(args, "y#", &source, &sourceSize)) { + if (!PyArg_ParseTuple(args, "y#:write", &source, &sourceSize)) { #else - if (!PyArg_ParseTuple(args, "s#", &source, &sourceSize)) { + if (!PyArg_ParseTuple(args, "s#:write", &source, &sourceSize)) { #endif return NULL; } @@ -116,15 +117,15 @@ #endif output.dst, output.pos); Py_XDECREF(res); + totalWrite += output.pos; output.pos = 0; } } PyMem_Free(output.dst); - /* TODO return bytes written */ - Py_RETURN_NONE; - } + return PyLong_FromSsize_t(totalWrite); +} static PyMethodDef ZstdDecompressionWriter_methods[] = { { "__enter__", (PyCFunction)ZstdDecompressionWriter_enter, METH_NOARGS, diff -r ed5b25874d99 -r 4baf79a77afa contrib/python-zstandard/c-ext/decompressobj.c --- a/contrib/python-zstandard/c-ext/decompressobj.c Thu Mar 23 19:54:59 2017 -0700 +++ b/contrib/python-zstandard/c-ext/decompressobj.c Fri Mar 24 08:37:26 2017 -0700 @@ -41,9 +41,9 @@ } #if PY_MAJOR_VERSION >= 3 - if (!PyArg_ParseTuple(args, "y#", + if (!PyArg_ParseTuple(args, "y#:decompress", #else - if (!PyArg_ParseTuple(args, "s#", + if (!PyArg_ParseTuple(args, "s#:decompress", #endif &source, &sourceSize)) { return NULL; diff -r ed5b25874d99 -r 4baf79a77afa contrib/python-zstandard/c-ext/decompressor.c --- a/contrib/python-zstandard/c-ext/decompressor.c Thu Mar 23 19:54:59 2017 -0700 +++ 
b/contrib/python-zstandard/c-ext/decompressor.c Fri Mar 24 08:37:26 2017 -0700 @@ -59,23 +59,19 @@ ZstdCompressionDict* dict = NULL; - self->refdctx = NULL; + self->dctx = NULL; self->dict = NULL; self->ddict = NULL; - if (!PyArg_ParseTupleAndKeywords(args, kwargs, "|O!", kwlist, + if (!PyArg_ParseTupleAndKeywords(args, kwargs, "|O!:ZstdDecompressor", kwlist, &ZstdCompressionDictType, &dict)) { return -1; } - /* Instead of creating a ZSTD_DCtx for every decompression operation, - we create an instance at object creation time and recycle it via - ZSTD_copyDCTx() on each use. This means each use is a malloc+memcpy - instead of a malloc+init. */ /* TODO lazily initialize the reference ZSTD_DCtx on first use since not instances of ZstdDecompressor will use a ZSTD_DCtx. */ - self->refdctx = ZSTD_createDCtx(); - if (!self->refdctx) { + self->dctx = ZSTD_createDCtx(); + if (!self->dctx) { PyErr_NoMemory(); goto except; } @@ -88,17 +84,17 @@ return 0; except: - if (self->refdctx) { - ZSTD_freeDCtx(self->refdctx); - self->refdctx = NULL; + if (self->dctx) { + ZSTD_freeDCtx(self->dctx); + self->dctx = NULL; } return -1; } static void Decompressor_dealloc(ZstdDecompressor* self) { - if (self->refdctx) { - ZSTD_freeDCtx(self->refdctx); + if (self->dctx) { + ZSTD_freeDCtx(self->dctx); } Py_XDECREF(self->dict); @@ -150,8 +146,8 @@ PyObject* totalReadPy; PyObject* totalWritePy; - if (!PyArg_ParseTupleAndKeywords(args, kwargs, "OO|kk", kwlist, &source, - &dest, &inSize, &outSize)) { + if (!PyArg_ParseTupleAndKeywords(args, kwargs, "OO|kk:copy_stream", kwlist, + &source, &dest, &inSize, &outSize)) { return NULL; } @@ -243,7 +239,7 @@ Py_DecRef(totalReadPy); Py_DecRef(totalWritePy); - finally: +finally: if (output.dst) { PyMem_Free(output.dst); } @@ -291,28 +287,19 @@ unsigned long long decompressedSize; size_t destCapacity; PyObject* result = NULL; - ZSTD_DCtx* dctx = NULL; void* dictData = NULL; size_t dictSize = 0; size_t zresult; #if PY_MAJOR_VERSION >= 3 - if 
(!PyArg_ParseTupleAndKeywords(args, kwargs, "y#|n", kwlist, + if (!PyArg_ParseTupleAndKeywords(args, kwargs, "y#|n:decompress", #else - if (!PyArg_ParseTupleAndKeywords(args, kwargs, "s#|n", kwlist, + if (!PyArg_ParseTupleAndKeywords(args, kwargs, "s#|n:decompress", #endif - &source, &sourceSize, &maxOutputSize)) { + kwlist, &source, &sourceSize, &maxOutputSize)) { return NULL; } - dctx = PyMem_Malloc(ZSTD_sizeof_DCtx(self->refdctx)); - if (!dctx) { - PyErr_NoMemory(); - return NULL; - } - - ZSTD_copyDCtx(dctx, self->refdctx); - if (self->dict) { dictData = self->dict->dictData; dictSize = self->dict->dictSize; @@ -320,12 +307,12 @@ if (dictData && !self->ddict) { Py_BEGIN_ALLOW_THREADS - self->ddict = ZSTD_createDDict(dictData, dictSize); + self->ddict = ZSTD_createDDict_byReference(dictData, dictSize); Py_END_ALLOW_THREADS if (!self->ddict) { PyErr_SetString(ZstdError, "could not create decompression dict"); - goto except; + return NULL; } } @@ -335,7 +322,7 @@ if (0 == maxOutputSize) { PyErr_SetString(ZstdError, "input data invalid or missing content size " "in frame header"); - goto except; + return NULL; } else { result = PyBytes_FromStringAndSize(NULL, maxOutputSize); @@ -348,45 +335,39 @@ } if (!result) { - goto except; + return NULL; } Py_BEGIN_ALLOW_THREADS if (self->ddict) { - zresult = ZSTD_decompress_usingDDict(dctx, PyBytes_AsString(result), destCapacity, + zresult = ZSTD_decompress_usingDDict(self->dctx, + PyBytes_AsString(result), destCapacity, source, sourceSize, self->ddict); } else { - zresult = ZSTD_decompressDCtx(dctx, PyBytes_AsString(result), destCapacity, source, sourceSize); + zresult = ZSTD_decompressDCtx(self->dctx, + PyBytes_AsString(result), destCapacity, source, sourceSize); } Py_END_ALLOW_THREADS if (ZSTD_isError(zresult)) { PyErr_Format(ZstdError, "decompression error: %s", ZSTD_getErrorName(zresult)); - goto except; + Py_DecRef(result); + return NULL; } else if (decompressedSize && zresult != decompressedSize) { 
PyErr_Format(ZstdError, "decompression error: decompressed %zu bytes; expected %llu", zresult, decompressedSize); - goto except; + Py_DecRef(result); + return NULL; } else if (zresult < destCapacity) { if (_PyBytes_Resize(&result, zresult)) { - goto except; + Py_DecRef(result); + return NULL; } } - goto finally; - -except: - Py_DecRef(result); - result = NULL; - -finally: - if (dctx) { - PyMem_FREE(dctx); - } - return result; } @@ -455,8 +436,8 @@ ZstdDecompressorIterator* result; size_t skipBytes = 0; - if (!PyArg_ParseTupleAndKeywords(args, kwargs, "O|kkk", kwlist, &reader, - &inSize, &outSize, &skipBytes)) { + if (!PyArg_ParseTupleAndKeywords(args, kwargs, "O|kkk:read_from", kwlist, + &reader, &inSize, &outSize, &skipBytes)) { return NULL; } @@ -534,19 +515,14 @@ goto finally; except: - if (result->reader) { - Py_DECREF(result->reader); - result->reader = NULL; - } + Py_CLEAR(result->reader); if (result->buffer) { PyBuffer_Release(result->buffer); - Py_DECREF(result->buffer); - result->buffer = NULL; + Py_CLEAR(result->buffer); } - Py_DECREF(result); - result = NULL; + Py_CLEAR(result); finally: @@ -577,7 +553,8 @@ size_t outSize = ZSTD_DStreamOutSize(); ZstdDecompressionWriter* result; - if (!PyArg_ParseTupleAndKeywords(args, kwargs, "O|k", kwlist, &writer, &outSize)) { + if (!PyArg_ParseTupleAndKeywords(args, kwargs, "O|k:write_to", kwlist, + &writer, &outSize)) { return NULL; } @@ -605,6 +582,200 @@ return result; } +PyDoc_STRVAR(Decompressor_decompress_content_dict_chain__doc__, +"Decompress a series of chunks using the content dictionary chaining technique\n" +); + +static PyObject* Decompressor_decompress_content_dict_chain(PyObject* self, PyObject* args, PyObject* kwargs) { + static char* kwlist[] = { + "frames", + NULL + }; + + PyObject* chunks; + Py_ssize_t chunksLen; + Py_ssize_t chunkIndex; + char parity = 0; + PyObject* chunk; + char* chunkData; + Py_ssize_t chunkSize; + ZSTD_DCtx* dctx = NULL; + size_t zresult; + ZSTD_frameParams frameParams; + 
void* buffer1 = NULL; + size_t buffer1Size = 0; + size_t buffer1ContentSize = 0; + void* buffer2 = NULL; + size_t buffer2Size = 0; + size_t buffer2ContentSize = 0; + void* destBuffer = NULL; + PyObject* result = NULL; + + if (!PyArg_ParseTupleAndKeywords(args, kwargs, "O!:decompress_content_dict_chain", + kwlist, &PyList_Type, &chunks)) { + return NULL; + } + + chunksLen = PyList_Size(chunks); + if (!chunksLen) { + PyErr_SetString(PyExc_ValueError, "empty input chain"); + return NULL; + } + + /* The first chunk should not be using a dictionary. We handle it specially. */ + chunk = PyList_GetItem(chunks, 0); + if (!PyBytes_Check(chunk)) { + PyErr_SetString(PyExc_ValueError, "chunk 0 must be bytes"); + return NULL; + } + + /* We require that all chunks be zstd frames and that they have content size set. */ + PyBytes_AsStringAndSize(chunk, &chunkData, &chunkSize); + zresult = ZSTD_getFrameParams(&frameParams, (void*)chunkData, chunkSize); + if (ZSTD_isError(zresult)) { + PyErr_SetString(PyExc_ValueError, "chunk 0 is not a valid zstd frame"); + return NULL; + } + else if (zresult) { + PyErr_SetString(PyExc_ValueError, "chunk 0 is too small to contain a zstd frame"); + return NULL; + } + + if (0 == frameParams.frameContentSize) { + PyErr_SetString(PyExc_ValueError, "chunk 0 missing content size in frame"); + return NULL; + } + + dctx = ZSTD_createDCtx(); + if (!dctx) { + PyErr_NoMemory(); + goto finally; + } + + buffer1Size = frameParams.frameContentSize; + buffer1 = PyMem_Malloc(buffer1Size); + if (!buffer1) { + goto finally; + } + + Py_BEGIN_ALLOW_THREADS + zresult = ZSTD_decompressDCtx(dctx, buffer1, buffer1Size, chunkData, chunkSize); + Py_END_ALLOW_THREADS + if (ZSTD_isError(zresult)) { + PyErr_Format(ZstdError, "could not decompress chunk 0: %s", ZSTD_getErrorName(zresult)); + goto finally; + } + + buffer1ContentSize = zresult; + + /* Special case of a simple chain. 
*/ + if (1 == chunksLen) { + result = PyBytes_FromStringAndSize(buffer1, buffer1Size); + goto finally; + } + + /* This should ideally look at next chunk. But this is slightly simpler. */ + buffer2Size = frameParams.frameContentSize; + buffer2 = PyMem_Malloc(buffer2Size); + if (!buffer2) { + goto finally; + } + + /* For each subsequent chunk, use the previous fulltext as a content dictionary. + Our strategy is to have 2 buffers. One holds the previous fulltext (to be + used as a content dictionary) and the other holds the new fulltext. The + buffers grow when needed but never decrease in size. This limits the + memory allocator overhead. + */ + for (chunkIndex = 1; chunkIndex < chunksLen; chunkIndex++) { + chunk = PyList_GetItem(chunks, chunkIndex); + if (!PyBytes_Check(chunk)) { + PyErr_Format(PyExc_ValueError, "chunk %zd must be bytes", chunkIndex); + goto finally; + } + + PyBytes_AsStringAndSize(chunk, &chunkData, &chunkSize); + zresult = ZSTD_getFrameParams(&frameParams, (void*)chunkData, chunkSize); + if (ZSTD_isError(zresult)) { + PyErr_Format(PyExc_ValueError, "chunk %zd is not a valid zstd frame", chunkIndex); + goto finally; + } + else if (zresult) { + PyErr_Format(PyExc_ValueError, "chunk %zd is too small to contain a zstd frame", chunkIndex); + goto finally; + } + + if (0 == frameParams.frameContentSize) { + PyErr_Format(PyExc_ValueError, "chunk %zd missing content size in frame", chunkIndex); + goto finally; + } + + parity = chunkIndex % 2; + + /* This could definitely be abstracted to reduce code duplication. */ + if (parity) { + /* Resize destination buffer to hold larger content. 
*/ + if (buffer2Size < frameParams.frameContentSize) { + buffer2Size = frameParams.frameContentSize; + destBuffer = PyMem_Realloc(buffer2, buffer2Size); + if (!destBuffer) { + goto finally; + } + buffer2 = destBuffer; + } + + Py_BEGIN_ALLOW_THREADS + zresult = ZSTD_decompress_usingDict(dctx, buffer2, buffer2Size, + chunkData, chunkSize, buffer1, buffer1ContentSize); + Py_END_ALLOW_THREADS + if (ZSTD_isError(zresult)) { + PyErr_Format(ZstdError, "could not decompress chunk %zd: %s", + chunkIndex, ZSTD_getErrorName(zresult)); + goto finally; + } + buffer2ContentSize = zresult; + } + else { + if (buffer1Size < frameParams.frameContentSize) { + buffer1Size = frameParams.frameContentSize; + destBuffer = PyMem_Realloc(buffer1, buffer1Size); + if (!destBuffer) { + goto finally; + } + buffer1 = destBuffer; + } + + Py_BEGIN_ALLOW_THREADS + zresult = ZSTD_decompress_usingDict(dctx, buffer1, buffer1Size, + chunkData, chunkSize, buffer2, buffer2ContentSize); + Py_END_ALLOW_THREADS + if (ZSTD_isError(zresult)) { + PyErr_Format(ZstdError, "could not decompress chunk %zd: %s", + chunkIndex, ZSTD_getErrorName(zresult)); + goto finally; + } + buffer1ContentSize = zresult; + } + } + + result = PyBytes_FromStringAndSize(parity ? buffer2 : buffer1, + parity ? 
buffer2ContentSize : buffer1ContentSize); + +finally: + if (buffer2) { + PyMem_Free(buffer2); + } + if (buffer1) { + PyMem_Free(buffer1); + } + + if (dctx) { + ZSTD_freeDCtx(dctx); + } + + return result; +} + static PyMethodDef Decompressor_methods[] = { { "copy_stream", (PyCFunction)Decompressor_copy_stream, METH_VARARGS | METH_KEYWORDS, Decompressor_copy_stream__doc__ }, @@ -616,6 +787,8 @@ Decompressor_read_from__doc__ }, { "write_to", (PyCFunction)Decompressor_write_to, METH_VARARGS | METH_KEYWORDS, Decompressor_write_to__doc__ }, + { "decompress_content_dict_chain", (PyCFunction)Decompressor_decompress_content_dict_chain, + METH_VARARGS | METH_KEYWORDS, Decompressor_decompress_content_dict_chain__doc__ }, { NULL, NULL } }; diff -r ed5b25874d99 -r 4baf79a77afa contrib/python-zstandard/c-ext/dictparams.c --- a/contrib/python-zstandard/c-ext/dictparams.c Thu Mar 23 19:54:59 2017 -0700 +++ b/contrib/python-zstandard/c-ext/dictparams.c Fri Mar 24 08:37:26 2017 -0700 @@ -18,8 +18,8 @@ unsigned notificationLevel; unsigned dictID; - if (!PyArg_ParseTuple(args, "IiII", &selectivityLevel, &compressionLevel, - &notificationLevel, &dictID)) { + if (!PyArg_ParseTuple(args, "IiII:DictParameters", + &selectivityLevel, &compressionLevel, &notificationLevel, &dictID)) { return NULL; } @@ -40,6 +40,22 @@ PyObject_Del(self); } +static PyMemberDef DictParameters_members[] = { + { "selectivity_level", T_UINT, + offsetof(DictParametersObject, selectivityLevel), READONLY, + "selectivity level" }, + { "compression_level", T_INT, + offsetof(DictParametersObject, compressionLevel), READONLY, + "compression level" }, + { "notification_level", T_UINT, + offsetof(DictParametersObject, notificationLevel), READONLY, + "notification level" }, + { "dict_id", T_UINT, + offsetof(DictParametersObject, dictID), READONLY, + "dictionary ID" }, + { NULL } +}; + static Py_ssize_t DictParameters_length(PyObject* self) { return 4; } @@ -102,7 +118,7 @@ 0, /* tp_iter */ 0, /* tp_iternext */ 0, /* tp_methods 
*/ - 0, /* tp_members */ + DictParameters_members, /* tp_members */ 0, /* tp_getset */ 0, /* tp_base */ 0, /* tp_dict */ diff -r ed5b25874d99 -r 4baf79a77afa contrib/python-zstandard/c-ext/frameparams.c --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/contrib/python-zstandard/c-ext/frameparams.c Fri Mar 24 08:37:26 2017 -0700 @@ -0,0 +1,132 @@ +/** +* Copyright (c) 2017-present, Gregory Szorc +* All rights reserved. +* +* This software may be modified and distributed under the terms +* of the BSD license. See the LICENSE file for details. +*/ + +#include "python-zstandard.h" + +extern PyObject* ZstdError; + +PyDoc_STRVAR(FrameParameters__doc__, + "FrameParameters: information about a zstd frame"); + +FrameParametersObject* get_frame_parameters(PyObject* self, PyObject* args) { + const char* source; + Py_ssize_t sourceSize; + ZSTD_frameParams params; + FrameParametersObject* result = NULL; + size_t zresult; + +#if PY_MAJOR_VERSION >= 3 + if (!PyArg_ParseTuple(args, "y#:get_frame_parameters", +#else + if (!PyArg_ParseTuple(args, "s#:get_frame_parameters", +#endif + &source, &sourceSize)) { + return NULL; + } + + /* Needed for Python 2 to reject unicode */ + if (!PyBytes_Check(PyTuple_GET_ITEM(args, 0))) { + PyErr_SetString(PyExc_TypeError, "argument must be bytes"); + return NULL; + } + + zresult = ZSTD_getFrameParams(&params, (void*)source, sourceSize); + + if (ZSTD_isError(zresult)) { + PyErr_Format(ZstdError, "cannot get frame parameters: %s", ZSTD_getErrorName(zresult)); + return NULL; + } + + if (zresult) { + PyErr_Format(ZstdError, "not enough data for frame parameters; need %zu bytes", zresult); + return NULL; + } + + result = PyObject_New(FrameParametersObject, &FrameParametersType); + if (!result) { + return NULL; + } + + result->frameContentSize = params.frameContentSize; + result->windowSize = params.windowSize; + result->dictID = params.dictID; + result->checksumFlag = params.checksumFlag ? 
1 : 0; + + return result; +} + +static void FrameParameters_dealloc(PyObject* self) { + PyObject_Del(self); +} + +static PyMemberDef FrameParameters_members[] = { + { "content_size", T_ULONGLONG, + offsetof(FrameParametersObject, frameContentSize), READONLY, + "frame content size" }, + { "window_size", T_UINT, + offsetof(FrameParametersObject, windowSize), READONLY, + "window size" }, + { "dict_id", T_UINT, + offsetof(FrameParametersObject, dictID), READONLY, + "dictionary ID" }, + { "has_checksum", T_BOOL, + offsetof(FrameParametersObject, checksumFlag), READONLY, + "checksum flag" }, + { NULL } +}; + +PyTypeObject FrameParametersType = { + PyVarObject_HEAD_INIT(NULL, 0) + "FrameParameters", /* tp_name */ + sizeof(FrameParametersObject), /* tp_basicsize */ + 0, /* tp_itemsize */ + (destructor)FrameParameters_dealloc, /* tp_dealloc */ + 0, /* tp_print */ + 0, /* tp_getattr */ + 0, /* tp_setattr */ + 0, /* tp_compare */ + 0, /* tp_repr */ + 0, /* tp_as_number */ + 0, /* tp_as_sequence */ + 0, /* tp_as_mapping */ + 0, /* tp_hash */ + 0, /* tp_call */ + 0, /* tp_str */ + 0, /* tp_getattro */ + 0, /* tp_setattro */ + 0, /* tp_as_buffer */ + Py_TPFLAGS_DEFAULT, /* tp_flags */ + FrameParameters__doc__, /* tp_doc */ + 0, /* tp_traverse */ + 0, /* tp_clear */ + 0, /* tp_richcompare */ + 0, /* tp_weaklistoffset */ + 0, /* tp_iter */ + 0, /* tp_iternext */ + 0, /* tp_methods */ + FrameParameters_members, /* tp_members */ + 0, /* tp_getset */ + 0, /* tp_base */ + 0, /* tp_dict */ + 0, /* tp_descr_get */ + 0, /* tp_descr_set */ + 0, /* tp_dictoffset */ + 0, /* tp_init */ + 0, /* tp_alloc */ + 0, /* tp_new */ +}; + +void frameparams_module_init(PyObject* mod) { + Py_TYPE(&FrameParametersType) = &PyType_Type; + if (PyType_Ready(&FrameParametersType) < 0) { + return; + } + + Py_IncRef((PyObject*)&FrameParametersType); + PyModule_AddObject(mod, "FrameParameters", (PyObject*)&FrameParametersType); +} diff -r ed5b25874d99 -r 4baf79a77afa 
contrib/python-zstandard/c-ext/python-zstandard.h --- a/contrib/python-zstandard/c-ext/python-zstandard.h Thu Mar 23 19:54:59 2017 -0700 +++ b/contrib/python-zstandard/c-ext/python-zstandard.h Fri Mar 24 08:37:26 2017 -0700 @@ -8,6 +8,7 @@ #define PY_SSIZE_T_CLEAN #include <Python.h> +#include "structmember.h" #define ZSTD_STATIC_LINKING_ONLY #define ZDICT_STATIC_LINKING_ONLY @@ -15,7 +16,7 @@ #include "zstd.h" #include "zdict.h" -#define PYTHON_ZSTANDARD_VERSION "0.6.0" +#define PYTHON_ZSTANDARD_VERSION "0.7.0" typedef enum { compressorobj_flush_finish, @@ -37,6 +38,16 @@ typedef struct { PyObject_HEAD + unsigned long long frameContentSize; + unsigned windowSize; + unsigned dictID; + char checksumFlag; +} FrameParametersObject; + +extern PyTypeObject FrameParametersType; + +typedef struct { + PyObject_HEAD unsigned selectivityLevel; int compressionLevel; unsigned notificationLevel; @@ -115,7 +126,7 @@ typedef struct { PyObject_HEAD - ZSTD_DCtx* refdctx; + ZSTD_DCtx* dctx; ZstdCompressionDict* dict; ZSTD_DDict* ddict; @@ -172,6 +183,7 @@ void ztopy_compression_parameters(CompressionParametersObject* params, ZSTD_compressionParameters* zparams); CompressionParametersObject* get_compression_parameters(PyObject* self, PyObject* args); +FrameParametersObject* get_frame_parameters(PyObject* self, PyObject* args); PyObject* estimate_compression_context_size(PyObject* self, PyObject* args); ZSTD_CStream* CStream_from_ZstdCompressor(ZstdCompressor* compressor, Py_ssize_t sourceSize); ZSTD_DStream* DStream_from_ZstdDecompressor(ZstdDecompressor* decompressor); diff -r ed5b25874d99 -r 4baf79a77afa contrib/python-zstandard/make_cffi.py --- a/contrib/python-zstandard/make_cffi.py Thu Mar 23 19:54:59 2017 -0700 +++ b/contrib/python-zstandard/make_cffi.py Fri Mar 24 08:37:26 2017 -0700 @@ -9,6 +9,7 @@ import cffi import distutils.ccompiler import os +import re import subprocess import tempfile @@ -19,6 +20,8 @@ 'common/entropy_common.c', 'common/error_private.c', 'common/fse_decompress.c', 
+ 'common/pool.c', + 'common/threading.c', 'common/xxhash.c', 'common/zstd_common.c', 'compress/fse_compress.c', @@ -26,10 +29,17 @@ 'compress/zstd_compress.c', 'decompress/huf_decompress.c', 'decompress/zstd_decompress.c', + 'dictBuilder/cover.c', 'dictBuilder/divsufsort.c', 'dictBuilder/zdict.c', )] +HEADERS = [os.path.join(HERE, 'zstd', *p) for p in ( + ('zstd.h',), + ('common', 'pool.h'), + ('dictBuilder', 'zdict.h'), +)] + INCLUDE_DIRS = [os.path.join(HERE, d) for d in ( 'zstd', 'zstd/common', @@ -53,56 +63,92 @@ args.extend([ '-E', '-DZSTD_STATIC_LINKING_ONLY', + '-DZDICT_STATIC_LINKING_ONLY', ]) elif compiler.compiler_type == 'msvc': args = [compiler.cc] args.extend([ '/EP', '/DZSTD_STATIC_LINKING_ONLY', + '/DZDICT_STATIC_LINKING_ONLY', ]) else: raise Exception('unsupported compiler type: %s' % compiler.compiler_type) -# zstd.h includes <stddef.h>, which is also included by cffi's boilerplate. -# This can lead to duplicate declarations. So we strip this include from the -# preprocessor invocation. +def preprocess(path): + # zstd.h includes <stddef.h>, which is also included by cffi's boilerplate. + # This can lead to duplicate declarations. So we strip this include from the + # preprocessor invocation. 
+ with open(path, 'rb') as fh: + lines = [l for l in fh if not l.startswith(b'#include <stddef.h>')] -with open(os.path.join(HERE, 'zstd', 'zstd.h'), 'rb') as fh: - lines = [l for l in fh if not l.startswith(b'#include <stddef.h>')] - -fd, input_file = tempfile.mkstemp(suffix='.h') -os.write(fd, b''.join(lines)) -os.close(fd) + fd, input_file = tempfile.mkstemp(suffix='.h') + os.write(fd, b''.join(lines)) + os.close(fd) -args.append(input_file) + try: + process = subprocess.Popen(args + [input_file], stdout=subprocess.PIPE) + output = process.communicate()[0] + ret = process.poll() + if ret: + raise Exception('preprocessor exited with error') -try: - process = subprocess.Popen(args, stdout=subprocess.PIPE) - output = process.communicate()[0] - ret = process.poll() - if ret: - raise Exception('preprocessor exited with error') -finally: - os.unlink(input_file) + return output + finally: + os.unlink(input_file) -def normalize_output(): + +def normalize_output(output): lines = [] for line in output.splitlines(): # CFFI's parser doesn't like __attribute__ on UNIX compilers. if line.startswith(b'__attribute__ ((visibility ("default"))) '): line = line[len(b'__attribute__ ((visibility ("default"))) '):] + if line.startswith(b'__attribute__((deprecated('): + continue + elif b'__declspec(deprecated(' in line: + continue + lines.append(line) return b'\n'.join(lines) + ffi = cffi.FFI() ffi.set_source('_zstd_cffi', ''' +#include "mem.h" #define ZSTD_STATIC_LINKING_ONLY #include "zstd.h" +#define ZDICT_STATIC_LINKING_ONLY +#include "pool.h" +#include "zdict.h" ''', sources=SOURCES, include_dirs=INCLUDE_DIRS) -ffi.cdef(normalize_output().decode('latin1')) +DEFINE = re.compile(b'^\\#define ([a-zA-Z0-9_]+) ') + +sources = [] + +for header in HEADERS: + preprocessed = preprocess(header) + sources.append(normalize_output(preprocessed)) + + # Do another pass over source and find constants that were preprocessed + # away. 
+ with open(header, 'rb') as fh: + for line in fh: + line = line.strip() + m = DEFINE.match(line) + if not m: + continue + + # The parser doesn't like some constants with complex values. + if m.group(1) in (b'ZSTD_LIB_VERSION', b'ZSTD_VERSION_STRING'): + continue + + sources.append(m.group(0) + b' ...') + +ffi.cdef(u'\n'.join(s.decode('latin1') for s in sources)) if __name__ == '__main__': ffi.compile() diff -r ed5b25874d99 -r 4baf79a77afa contrib/python-zstandard/setup.py --- a/contrib/python-zstandard/setup.py Thu Mar 23 19:54:59 2017 -0700 +++ b/contrib/python-zstandard/setup.py Fri Mar 24 08:37:26 2017 -0700 @@ -62,6 +62,7 @@ 'Programming Language :: Python :: 3.3', 'Programming Language :: Python :: 3.4', 'Programming Language :: Python :: 3.5', + 'Programming Language :: Python :: 3.6', ], keywords='zstandard zstd compression', ext_modules=extensions, diff -r ed5b25874d99 -r 4baf79a77afa contrib/python-zstandard/setup_zstd.py --- a/contrib/python-zstandard/setup_zstd.py Thu Mar 23 19:54:59 2017 -0700 +++ b/contrib/python-zstandard/setup_zstd.py Fri Mar 24 08:37:26 2017 -0700 @@ -12,6 +12,8 @@ 'common/entropy_common.c', 'common/error_private.c', 'common/fse_decompress.c', + 'common/pool.c', + 'common/threading.c', 'common/xxhash.c', 'common/zstd_common.c', 'compress/fse_compress.c', @@ -19,11 +21,13 @@ 'compress/zstd_compress.c', 'decompress/huf_decompress.c', 'decompress/zstd_decompress.c', + 'dictBuilder/cover.c', 'dictBuilder/divsufsort.c', 'dictBuilder/zdict.c', )] zstd_sources_legacy = ['zstd/%s' % p for p in ( + 'deprecated/zbuff_common.c', 'deprecated/zbuff_compress.c', 'deprecated/zbuff_decompress.c', 'legacy/zstd_v01.c', @@ -63,6 +67,7 @@ 'c-ext/decompressoriterator.c', 'c-ext/decompressionwriter.c', 'c-ext/dictparams.c', + 'c-ext/frameparams.c', ] zstd_depends = [ diff -r ed5b25874d99 -r 4baf79a77afa contrib/python-zstandard/tests/common.py --- a/contrib/python-zstandard/tests/common.py Thu Mar 23 19:54:59 2017 -0700 +++ 
b/contrib/python-zstandard/tests/common.py Fri Mar 24 08:37:26 2017 -0700 @@ -1,4 +1,50 @@ +import inspect import io +import types + + +def make_cffi(cls): + """Decorator to add CFFI versions of each test method.""" + + try: + import zstd_cffi + except ImportError: + return cls + + # If CFFI version is available, dynamically construct test methods + # that use it. + + for attr in dir(cls): + fn = getattr(cls, attr) + if not inspect.ismethod(fn) and not inspect.isfunction(fn): + continue + + if not fn.__name__.startswith('test_'): + continue + + name = '%s_cffi' % fn.__name__ + + # Replace the "zstd" symbol with the CFFI module instance. Then copy + # the function object and install it in a new attribute. + if isinstance(fn, types.FunctionType): + globs = dict(fn.__globals__) + globs['zstd'] = zstd_cffi + new_fn = types.FunctionType(fn.__code__, globs, name, + fn.__defaults__, fn.__closure__) + new_method = new_fn + else: + globs = dict(fn.__func__.func_globals) + globs['zstd'] = zstd_cffi + new_fn = types.FunctionType(fn.__func__.func_code, globs, name, + fn.__func__.func_defaults, + fn.__func__.func_closure) + new_method = types.UnboundMethodType(new_fn, fn.im_self, + fn.im_class) + + setattr(cls, name, new_method) + + return cls + class OpCountingBytesIO(io.BytesIO): def __init__(self, *args, **kwargs): diff -r ed5b25874d99 -r 4baf79a77afa contrib/python-zstandard/tests/test_cffi.py --- a/contrib/python-zstandard/tests/test_cffi.py Thu Mar 23 19:54:59 2017 -0700 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,35 +0,0 @@ -import io - -try: - import unittest2 as unittest -except ImportError: - import unittest - -import zstd - -try: - import zstd_cffi -except ImportError: - raise unittest.SkipTest('cffi version of zstd not available') - - -class TestCFFIWriteToToCDecompressor(unittest.TestCase): - def test_simple(self): - orig = io.BytesIO() - orig.write(b'foo') - orig.write(b'bar') - orig.write(b'foobar' * 16384) - - dest = io.BytesIO() - cctx = 
zstd_cffi.ZstdCompressor() - with cctx.write_to(dest) as compressor: - compressor.write(orig.getvalue()) - - uncompressed = io.BytesIO() - dctx = zstd.ZstdDecompressor() - with dctx.write_to(uncompressed) as decompressor: - decompressor.write(dest.getvalue()) - - self.assertEqual(uncompressed.getvalue(), orig.getvalue()) - - diff -r ed5b25874d99 -r 4baf79a77afa contrib/python-zstandard/tests/test_compressor.py --- a/contrib/python-zstandard/tests/test_compressor.py Thu Mar 23 19:54:59 2017 -0700 +++ b/contrib/python-zstandard/tests/test_compressor.py Fri Mar 24 08:37:26 2017 -0700 @@ -10,7 +10,10 @@ import zstd -from .common import OpCountingBytesIO +from .common import ( + make_cffi, + OpCountingBytesIO, +) if sys.version_info[0] >= 3: @@ -19,6 +22,7 @@ next = lambda it: it.next() +@make_cffi class TestCompressor(unittest.TestCase): def test_level_bounds(self): with self.assertRaises(ValueError): @@ -28,18 +32,17 @@ zstd.ZstdCompressor(level=23) +@make_cffi class TestCompressor_compress(unittest.TestCase): def test_compress_empty(self): cctx = zstd.ZstdCompressor(level=1) - cctx.compress(b'') - - cctx = zstd.ZstdCompressor(level=22) - cctx.compress(b'') - - def test_compress_empty(self): - cctx = zstd.ZstdCompressor(level=1) - self.assertEqual(cctx.compress(b''), - b'\x28\xb5\x2f\xfd\x00\x48\x01\x00\x00') + result = cctx.compress(b'') + self.assertEqual(result, b'\x28\xb5\x2f\xfd\x00\x48\x01\x00\x00') + params = zstd.get_frame_parameters(result) + self.assertEqual(params.content_size, 0) + self.assertEqual(params.window_size, 524288) + self.assertEqual(params.dict_id, 0) + self.assertFalse(params.has_checksum, 0) # TODO should be temporary until https://github.com/facebook/zstd/issues/506 # is fixed. @@ -59,6 +62,13 @@ self.assertEqual(len(result), 999) self.assertEqual(result[0:4], b'\x28\xb5\x2f\xfd') + # This matches the test for read_from() below. 
+ cctx = zstd.ZstdCompressor(level=1) + result = cctx.compress(b'f' * zstd.COMPRESSION_RECOMMENDED_INPUT_SIZE + b'o') + self.assertEqual(result, b'\x28\xb5\x2f\xfd\x00\x40\x54\x00\x00' + b'\x10\x66\x66\x01\x00\xfb\xff\x39\xc0' + b'\x02\x09\x00\x00\x6f') + def test_write_checksum(self): cctx = zstd.ZstdCompressor(level=1) no_checksum = cctx.compress(b'foobar') @@ -67,6 +77,12 @@ self.assertEqual(len(with_checksum), len(no_checksum) + 4) + no_params = zstd.get_frame_parameters(no_checksum) + with_params = zstd.get_frame_parameters(with_checksum) + + self.assertFalse(no_params.has_checksum) + self.assertTrue(with_params.has_checksum) + def test_write_content_size(self): cctx = zstd.ZstdCompressor(level=1) no_size = cctx.compress(b'foobar' * 256) @@ -75,6 +91,11 @@ self.assertEqual(len(with_size), len(no_size) + 1) + no_params = zstd.get_frame_parameters(no_size) + with_params = zstd.get_frame_parameters(with_size) + self.assertEqual(no_params.content_size, 0) + self.assertEqual(with_params.content_size, 1536) + def test_no_dict_id(self): samples = [] for i in range(128): @@ -92,6 +113,11 @@ self.assertEqual(len(with_dict_id), len(no_dict_id) + 4) + no_params = zstd.get_frame_parameters(no_dict_id) + with_params = zstd.get_frame_parameters(with_dict_id) + self.assertEqual(no_params.dict_id, 0) + self.assertEqual(with_params.dict_id, 1584102229) + def test_compress_dict_multiple(self): samples = [] for i in range(128): @@ -107,6 +133,7 @@ cctx.compress(b'foo bar foobar foo bar foobar') +@make_cffi class TestCompressor_compressobj(unittest.TestCase): def test_compressobj_empty(self): cctx = zstd.ZstdCompressor(level=1) @@ -127,6 +154,12 @@ self.assertEqual(len(result), 999) self.assertEqual(result[0:4], b'\x28\xb5\x2f\xfd') + params = zstd.get_frame_parameters(result) + self.assertEqual(params.content_size, 0) + self.assertEqual(params.window_size, 1048576) + self.assertEqual(params.dict_id, 0) + self.assertFalse(params.has_checksum) + def test_write_checksum(self): cctx 
= zstd.ZstdCompressor(level=1) cobj = cctx.compressobj() @@ -135,6 +168,15 @@ cobj = cctx.compressobj() with_checksum = cobj.compress(b'foobar') + cobj.flush() + no_params = zstd.get_frame_parameters(no_checksum) + with_params = zstd.get_frame_parameters(with_checksum) + self.assertEqual(no_params.content_size, 0) + self.assertEqual(with_params.content_size, 0) + self.assertEqual(no_params.dict_id, 0) + self.assertEqual(with_params.dict_id, 0) + self.assertFalse(no_params.has_checksum) + self.assertTrue(with_params.has_checksum) + self.assertEqual(len(with_checksum), len(no_checksum) + 4) def test_write_content_size(self): @@ -145,6 +187,15 @@ cobj = cctx.compressobj(size=len(b'foobar' * 256)) with_size = cobj.compress(b'foobar' * 256) + cobj.flush() + no_params = zstd.get_frame_parameters(no_size) + with_params = zstd.get_frame_parameters(with_size) + self.assertEqual(no_params.content_size, 0) + self.assertEqual(with_params.content_size, 1536) + self.assertEqual(no_params.dict_id, 0) + self.assertEqual(with_params.dict_id, 0) + self.assertFalse(no_params.has_checksum) + self.assertFalse(with_params.has_checksum) + self.assertEqual(len(with_size), len(no_size) + 1) def test_compress_after_finished(self): @@ -187,6 +238,7 @@ self.assertEqual(header, b'\x01\x00\x00') +@make_cffi class TestCompressor_copy_stream(unittest.TestCase): def test_no_read(self): source = object() @@ -229,6 +281,12 @@ self.assertEqual(r, 255 * 16384) self.assertEqual(w, 999) + params = zstd.get_frame_parameters(dest.getvalue()) + self.assertEqual(params.content_size, 0) + self.assertEqual(params.window_size, 1048576) + self.assertEqual(params.dict_id, 0) + self.assertFalse(params.has_checksum) + def test_write_checksum(self): source = io.BytesIO(b'foobar') no_checksum = io.BytesIO() @@ -244,6 +302,15 @@ self.assertEqual(len(with_checksum.getvalue()), len(no_checksum.getvalue()) + 4) + no_params = zstd.get_frame_parameters(no_checksum.getvalue()) + with_params = 
zstd.get_frame_parameters(with_checksum.getvalue()) + self.assertEqual(no_params.content_size, 0) + self.assertEqual(with_params.content_size, 0) + self.assertEqual(no_params.dict_id, 0) + self.assertEqual(with_params.dict_id, 0) + self.assertFalse(no_params.has_checksum) + self.assertTrue(with_params.has_checksum) + def test_write_content_size(self): source = io.BytesIO(b'foobar' * 256) no_size = io.BytesIO() @@ -268,6 +335,15 @@ self.assertEqual(len(with_size.getvalue()), len(no_size.getvalue()) + 1) + no_params = zstd.get_frame_parameters(no_size.getvalue()) + with_params = zstd.get_frame_parameters(with_size.getvalue()) + self.assertEqual(no_params.content_size, 0) + self.assertEqual(with_params.content_size, 1536) + self.assertEqual(no_params.dict_id, 0) + self.assertEqual(with_params.dict_id, 0) + self.assertFalse(no_params.has_checksum) + self.assertFalse(with_params.has_checksum) + def test_read_write_size(self): source = OpCountingBytesIO(b'foobarfoobar') dest = OpCountingBytesIO() @@ -288,18 +364,25 @@ return buffer.getvalue() +@make_cffi class TestCompressor_write_to(unittest.TestCase): def test_empty(self): - self.assertEqual(compress(b'', 1), - b'\x28\xb5\x2f\xfd\x00\x48\x01\x00\x00') + result = compress(b'', 1) + self.assertEqual(result, b'\x28\xb5\x2f\xfd\x00\x48\x01\x00\x00') + + params = zstd.get_frame_parameters(result) + self.assertEqual(params.content_size, 0) + self.assertEqual(params.window_size, 524288) + self.assertEqual(params.dict_id, 0) + self.assertFalse(params.has_checksum) def test_multiple_compress(self): buffer = io.BytesIO() cctx = zstd.ZstdCompressor(level=5) with cctx.write_to(buffer) as compressor: - compressor.write(b'foo') - compressor.write(b'bar') - compressor.write(b'x' * 8192) + self.assertEqual(compressor.write(b'foo'), 0) + self.assertEqual(compressor.write(b'bar'), 0) + self.assertEqual(compressor.write(b'x' * 8192), 0) result = buffer.getvalue() self.assertEqual(result, @@ -318,11 +401,23 @@ buffer = io.BytesIO() cctx = 
zstd.ZstdCompressor(level=9, dict_data=d) with cctx.write_to(buffer) as compressor: - compressor.write(b'foo') - compressor.write(b'bar') - compressor.write(b'foo' * 16384) + self.assertEqual(compressor.write(b'foo'), 0) + self.assertEqual(compressor.write(b'bar'), 0) + self.assertEqual(compressor.write(b'foo' * 16384), 634) compressed = buffer.getvalue() + + params = zstd.get_frame_parameters(compressed) + self.assertEqual(params.content_size, 0) + self.assertEqual(params.window_size, 1024) + self.assertEqual(params.dict_id, d.dict_id()) + self.assertFalse(params.has_checksum) + + self.assertEqual(compressed[0:32], + b'\x28\xb5\x2f\xfd\x03\x00\x55\x7b\x6b\x5e\x54\x00' + b'\x00\x00\x02\xfc\xf4\xa5\xba\x23\x3f\x85\xb3\x54' + b'\x00\x00\x18\x6f\x6f\x66\x01\x00') + h = hashlib.sha1(compressed).hexdigest() self.assertEqual(h, '1c5bcd25181bcd8c1a73ea8773323e0056129f92') @@ -332,11 +427,18 @@ buffer = io.BytesIO() cctx = zstd.ZstdCompressor(compression_params=params) with cctx.write_to(buffer) as compressor: - compressor.write(b'foo') - compressor.write(b'bar') - compressor.write(b'foobar' * 16384) + self.assertEqual(compressor.write(b'foo'), 0) + self.assertEqual(compressor.write(b'bar'), 0) + self.assertEqual(compressor.write(b'foobar' * 16384), 0) compressed = buffer.getvalue() + + params = zstd.get_frame_parameters(compressed) + self.assertEqual(params.content_size, 0) + self.assertEqual(params.window_size, 1048576) + self.assertEqual(params.dict_id, 0) + self.assertFalse(params.has_checksum) + h = hashlib.sha1(compressed).hexdigest() self.assertEqual(h, '1ae31f270ed7de14235221a604b31ecd517ebd99') @@ -344,12 +446,21 @@ no_checksum = io.BytesIO() cctx = zstd.ZstdCompressor(level=1) with cctx.write_to(no_checksum) as compressor: - compressor.write(b'foobar') + self.assertEqual(compressor.write(b'foobar'), 0) with_checksum = io.BytesIO() cctx = zstd.ZstdCompressor(level=1, write_checksum=True) with cctx.write_to(with_checksum) as compressor: - 
compressor.write(b'foobar') + self.assertEqual(compressor.write(b'foobar'), 0) + + no_params = zstd.get_frame_parameters(no_checksum.getvalue()) + with_params = zstd.get_frame_parameters(with_checksum.getvalue()) + self.assertEqual(no_params.content_size, 0) + self.assertEqual(with_params.content_size, 0) + self.assertEqual(no_params.dict_id, 0) + self.assertEqual(with_params.dict_id, 0) + self.assertFalse(no_params.has_checksum) + self.assertTrue(with_params.has_checksum) self.assertEqual(len(with_checksum.getvalue()), len(no_checksum.getvalue()) + 4) @@ -358,12 +469,12 @@ no_size = io.BytesIO() cctx = zstd.ZstdCompressor(level=1) with cctx.write_to(no_size) as compressor: - compressor.write(b'foobar' * 256) + self.assertEqual(compressor.write(b'foobar' * 256), 0) with_size = io.BytesIO() cctx = zstd.ZstdCompressor(level=1, write_content_size=True) with cctx.write_to(with_size) as compressor: - compressor.write(b'foobar' * 256) + self.assertEqual(compressor.write(b'foobar' * 256), 0) # Source size is not known in streaming mode, so header not # written. @@ -373,7 +484,16 @@ # Declaring size will write the header. 
with_size = io.BytesIO() with cctx.write_to(with_size, size=len(b'foobar' * 256)) as compressor: - compressor.write(b'foobar' * 256) + self.assertEqual(compressor.write(b'foobar' * 256), 0) + + no_params = zstd.get_frame_parameters(no_size.getvalue()) + with_params = zstd.get_frame_parameters(with_size.getvalue()) + self.assertEqual(no_params.content_size, 0) + self.assertEqual(with_params.content_size, 1536) + self.assertEqual(no_params.dict_id, 0) + self.assertEqual(with_params.dict_id, 0) + self.assertFalse(no_params.has_checksum) + self.assertFalse(with_params.has_checksum) self.assertEqual(len(with_size.getvalue()), len(no_size.getvalue()) + 1) @@ -390,12 +510,21 @@ with_dict_id = io.BytesIO() cctx = zstd.ZstdCompressor(level=1, dict_data=d) with cctx.write_to(with_dict_id) as compressor: - compressor.write(b'foobarfoobar') + self.assertEqual(compressor.write(b'foobarfoobar'), 0) cctx = zstd.ZstdCompressor(level=1, dict_data=d, write_dict_id=False) no_dict_id = io.BytesIO() with cctx.write_to(no_dict_id) as compressor: - compressor.write(b'foobarfoobar') + self.assertEqual(compressor.write(b'foobarfoobar'), 0) + + no_params = zstd.get_frame_parameters(no_dict_id.getvalue()) + with_params = zstd.get_frame_parameters(with_dict_id.getvalue()) + self.assertEqual(no_params.content_size, 0) + self.assertEqual(with_params.content_size, 0) + self.assertEqual(no_params.dict_id, 0) + self.assertEqual(with_params.dict_id, d.dict_id()) + self.assertFalse(no_params.has_checksum) + self.assertFalse(with_params.has_checksum) self.assertEqual(len(with_dict_id.getvalue()), len(no_dict_id.getvalue()) + 4) @@ -412,9 +541,9 @@ cctx = zstd.ZstdCompressor(level=3) dest = OpCountingBytesIO() with cctx.write_to(dest, write_size=1) as compressor: - compressor.write(b'foo') - compressor.write(b'bar') - compressor.write(b'foobar') + self.assertEqual(compressor.write(b'foo'), 0) + self.assertEqual(compressor.write(b'bar'), 0) + self.assertEqual(compressor.write(b'foobar'), 0) 
self.assertEqual(len(dest.getvalue()), dest._write_count) @@ -422,15 +551,15 @@ cctx = zstd.ZstdCompressor(level=3) dest = OpCountingBytesIO() with cctx.write_to(dest) as compressor: - compressor.write(b'foo') + self.assertEqual(compressor.write(b'foo'), 0) self.assertEqual(dest._write_count, 0) - compressor.flush() + self.assertEqual(compressor.flush(), 12) self.assertEqual(dest._write_count, 1) - compressor.write(b'bar') + self.assertEqual(compressor.write(b'bar'), 0) self.assertEqual(dest._write_count, 1) - compressor.flush() + self.assertEqual(compressor.flush(), 6) self.assertEqual(dest._write_count, 2) - compressor.write(b'baz') + self.assertEqual(compressor.write(b'baz'), 0) self.assertEqual(dest._write_count, 3) @@ -438,10 +567,10 @@ cctx = zstd.ZstdCompressor(level=3, write_checksum=True) dest = OpCountingBytesIO() with cctx.write_to(dest) as compressor: - compressor.write(b'foobar' * 8192) + self.assertEqual(compressor.write(b'foobar' * 8192), 0) count = dest._write_count offset = dest.tell() - compressor.flush() + self.assertEqual(compressor.flush(), 23) self.assertGreater(dest._write_count, count) self.assertGreater(dest.tell(), offset) offset = dest.tell() @@ -456,18 +585,22 @@ self.assertEqual(header, b'\x01\x00\x00') +@make_cffi class TestCompressor_read_from(unittest.TestCase): def test_type_validation(self): cctx = zstd.ZstdCompressor() # Object with read() works. - cctx.read_from(io.BytesIO()) + for chunk in cctx.read_from(io.BytesIO()): + pass # Buffer protocol works. - cctx.read_from(b'foobar') + for chunk in cctx.read_from(b'foobar'): + pass with self.assertRaisesRegexp(ValueError, 'must pass an object with a read'): - cctx.read_from(True) + for chunk in cctx.read_from(True): + pass def test_read_empty(self): cctx = zstd.ZstdCompressor(level=1) @@ -521,6 +654,12 @@ # We should get the same output as the one-shot compression mechanism. 
self.assertEqual(b''.join(chunks), cctx.compress(source.getvalue())) + params = zstd.get_frame_parameters(b''.join(chunks)) + self.assertEqual(params.content_size, 0) + self.assertEqual(params.window_size, 262144) + self.assertEqual(params.dict_id, 0) + self.assertFalse(params.has_checksum) + # Now check the buffer protocol. it = cctx.read_from(source.getvalue()) chunks = list(it) diff -r ed5b25874d99 -r 4baf79a77afa contrib/python-zstandard/tests/test_data_structures.py --- a/contrib/python-zstandard/tests/test_data_structures.py Thu Mar 23 19:54:59 2017 -0700 +++ b/contrib/python-zstandard/tests/test_data_structures.py Fri Mar 24 08:37:26 2017 -0700 @@ -13,6 +13,12 @@ import zstd +from . common import ( + make_cffi, +) + + +@make_cffi class TestCompressionParameters(unittest.TestCase): def test_init_bad_arg_type(self): with self.assertRaises(TypeError): @@ -42,7 +48,81 @@ p = zstd.get_compression_parameters(1) self.assertIsInstance(p, zstd.CompressionParameters) - self.assertEqual(p[0], 19) + self.assertEqual(p.window_log, 19) + + def test_members(self): + p = zstd.CompressionParameters(10, 6, 7, 4, 5, 8, 1) + self.assertEqual(p.window_log, 10) + self.assertEqual(p.chain_log, 6) + self.assertEqual(p.hash_log, 7) + self.assertEqual(p.search_log, 4) + self.assertEqual(p.search_length, 5) + self.assertEqual(p.target_length, 8) + self.assertEqual(p.strategy, 1) + + +@make_cffi +class TestFrameParameters(unittest.TestCase): + def test_invalid_type(self): + with self.assertRaises(TypeError): + zstd.get_frame_parameters(None) + + with self.assertRaises(TypeError): + zstd.get_frame_parameters(u'foobarbaz') + + def test_invalid_input_sizes(self): + with self.assertRaisesRegexp(zstd.ZstdError, 'not enough data for frame'): + zstd.get_frame_parameters(b'') + + with self.assertRaisesRegexp(zstd.ZstdError, 'not enough data for frame'): + zstd.get_frame_parameters(zstd.FRAME_HEADER) + + def test_invalid_frame(self): + with self.assertRaisesRegexp(zstd.ZstdError, 'Unknown frame 
descriptor'): + zstd.get_frame_parameters(b'foobarbaz') + + def test_attributes(self): + params = zstd.get_frame_parameters(zstd.FRAME_HEADER + b'\x00\x00') + self.assertEqual(params.content_size, 0) + self.assertEqual(params.window_size, 1024) + self.assertEqual(params.dict_id, 0) + self.assertFalse(params.has_checksum) + + # Lowest 2 bits indicate a dictionary and length. Here, the dict id is 1 byte. + params = zstd.get_frame_parameters(zstd.FRAME_HEADER + b'\x01\x00\xff') + self.assertEqual(params.content_size, 0) + self.assertEqual(params.window_size, 1024) + self.assertEqual(params.dict_id, 255) + self.assertFalse(params.has_checksum) + + # Lowest 3rd bit indicates if checksum is present. + params = zstd.get_frame_parameters(zstd.FRAME_HEADER + b'\x04\x00') + self.assertEqual(params.content_size, 0) + self.assertEqual(params.window_size, 1024) + self.assertEqual(params.dict_id, 0) + self.assertTrue(params.has_checksum) + + # Upper 2 bits indicate content size. + params = zstd.get_frame_parameters(zstd.FRAME_HEADER + b'\x40\x00\xff\x00') + self.assertEqual(params.content_size, 511) + self.assertEqual(params.window_size, 1024) + self.assertEqual(params.dict_id, 0) + self.assertFalse(params.has_checksum) + + # Window descriptor is 2nd byte after frame header. + params = zstd.get_frame_parameters(zstd.FRAME_HEADER + b'\x00\x40') + self.assertEqual(params.content_size, 0) + self.assertEqual(params.window_size, 262144) + self.assertEqual(params.dict_id, 0) + self.assertFalse(params.has_checksum) + + # Set multiple things. 
+ params = zstd.get_frame_parameters(zstd.FRAME_HEADER + b'\x45\x40\x0f\x10\x00') + self.assertEqual(params.content_size, 272) + self.assertEqual(params.window_size, 262144) + self.assertEqual(params.dict_id, 15) + self.assertTrue(params.has_checksum) + if hypothesis: s_windowlog = strategies.integers(min_value=zstd.WINDOWLOG_MIN, @@ -65,6 +145,8 @@ zstd.STRATEGY_BTLAZY2, zstd.STRATEGY_BTOPT)) + + @make_cffi class TestCompressionParametersHypothesis(unittest.TestCase): @hypothesis.given(s_windowlog, s_chainlog, s_hashlog, s_searchlog, s_searchlength, s_targetlength, s_strategy) @@ -73,9 +155,6 @@ p = zstd.CompressionParameters(windowlog, chainlog, hashlog, searchlog, searchlength, targetlength, strategy) - self.assertEqual(tuple(p), - (windowlog, chainlog, hashlog, searchlog, - searchlength, targetlength, strategy)) # Verify we can instantiate a compressor with the supplied values. # ZSTD_checkCParams moves the goal posts on us from what's advertised diff -r ed5b25874d99 -r 4baf79a77afa contrib/python-zstandard/tests/test_decompressor.py --- a/contrib/python-zstandard/tests/test_decompressor.py Thu Mar 23 19:54:59 2017 -0700 +++ b/contrib/python-zstandard/tests/test_decompressor.py Fri Mar 24 08:37:26 2017 -0700 @@ -10,7 +10,10 @@ import zstd -from .common import OpCountingBytesIO +from .common import ( + make_cffi, + OpCountingBytesIO, +) if sys.version_info[0] >= 3: @@ -19,6 +22,7 @@ next = lambda it: it.next() +@make_cffi class TestDecompressor_decompress(unittest.TestCase): def test_empty_input(self): dctx = zstd.ZstdDecompressor() @@ -119,6 +123,7 @@ self.assertEqual(decompressed, sources[i]) +@make_cffi class TestDecompressor_copy_stream(unittest.TestCase): def test_no_read(self): source = object() @@ -180,6 +185,7 @@ self.assertEqual(dest._write_count, len(dest.getvalue())) +@make_cffi class TestDecompressor_decompressobj(unittest.TestCase): def test_simple(self): data = zstd.ZstdCompressor(level=1).compress(b'foobar') @@ -207,6 +213,7 @@ return 
buffer.getvalue() +@make_cffi class TestDecompressor_write_to(unittest.TestCase): def test_empty_roundtrip(self): cctx = zstd.ZstdCompressor() @@ -256,14 +263,14 @@ buffer = io.BytesIO() cctx = zstd.ZstdCompressor(dict_data=d) with cctx.write_to(buffer) as compressor: - compressor.write(orig) + self.assertEqual(compressor.write(orig), 1544) compressed = buffer.getvalue() buffer = io.BytesIO() dctx = zstd.ZstdDecompressor(dict_data=d) with dctx.write_to(buffer) as decompressor: - decompressor.write(compressed) + self.assertEqual(decompressor.write(compressed), len(orig)) self.assertEqual(buffer.getvalue(), orig) @@ -291,6 +298,7 @@ self.assertEqual(dest._write_count, len(dest.getvalue())) +@make_cffi class TestDecompressor_read_from(unittest.TestCase): def test_type_validation(self): dctx = zstd.ZstdDecompressor() @@ -302,7 +310,7 @@ dctx.read_from(b'foobar') with self.assertRaisesRegexp(ValueError, 'must pass an object with a read'): - dctx.read_from(True) + b''.join(dctx.read_from(True)) def test_empty_input(self): dctx = zstd.ZstdDecompressor() @@ -351,7 +359,7 @@ dctx = zstd.ZstdDecompressor() with self.assertRaisesRegexp(ValueError, 'skip_bytes must be smaller than read_size'): - dctx.read_from(b'', skip_bytes=1, read_size=1) + b''.join(dctx.read_from(b'', skip_bytes=1, read_size=1)) with self.assertRaisesRegexp(ValueError, 'skip_bytes larger than first input chunk'): b''.join(dctx.read_from(b'foobar', skip_bytes=10)) @@ -476,3 +484,94 @@ self.assertEqual(len(chunk), 1) self.assertEqual(source._read_count, len(source.getvalue())) + + +@make_cffi +class TestDecompressor_content_dict_chain(unittest.TestCase): + def test_bad_inputs_simple(self): + dctx = zstd.ZstdDecompressor() + + with self.assertRaises(TypeError): + dctx.decompress_content_dict_chain(b'foo') + + with self.assertRaises(TypeError): + dctx.decompress_content_dict_chain((b'foo', b'bar')) + + with self.assertRaisesRegexp(ValueError, 'empty input chain'): + dctx.decompress_content_dict_chain([]) + + 
with self.assertRaisesRegexp(ValueError, 'chunk 0 must be bytes'): + dctx.decompress_content_dict_chain([u'foo']) + + with self.assertRaisesRegexp(ValueError, 'chunk 0 must be bytes'): + dctx.decompress_content_dict_chain([True]) + + with self.assertRaisesRegexp(ValueError, 'chunk 0 is too small to contain a zstd frame'): + dctx.decompress_content_dict_chain([zstd.FRAME_HEADER]) + + with self.assertRaisesRegexp(ValueError, 'chunk 0 is not a valid zstd frame'): + dctx.decompress_content_dict_chain([b'foo' * 8]) + + no_size = zstd.ZstdCompressor().compress(b'foo' * 64) + + with self.assertRaisesRegexp(ValueError, 'chunk 0 missing content size in frame'): + dctx.decompress_content_dict_chain([no_size]) + + # Corrupt first frame. + frame = zstd.ZstdCompressor(write_content_size=True).compress(b'foo' * 64) + frame = frame[0:12] + frame[15:] + with self.assertRaisesRegexp(zstd.ZstdError, 'could not decompress chunk 0'): + dctx.decompress_content_dict_chain([frame]) + + def test_bad_subsequent_input(self): + initial = zstd.ZstdCompressor(write_content_size=True).compress(b'foo' * 64) + + dctx = zstd.ZstdDecompressor() + + with self.assertRaisesRegexp(ValueError, 'chunk 1 must be bytes'): + dctx.decompress_content_dict_chain([initial, u'foo']) + + with self.assertRaisesRegexp(ValueError, 'chunk 1 must be bytes'): + dctx.decompress_content_dict_chain([initial, None]) + + with self.assertRaisesRegexp(ValueError, 'chunk 1 is too small to contain a zstd frame'): + dctx.decompress_content_dict_chain([initial, zstd.FRAME_HEADER]) + + with self.assertRaisesRegexp(ValueError, 'chunk 1 is not a valid zstd frame'): + dctx.decompress_content_dict_chain([initial, b'foo' * 8]) + + no_size = zstd.ZstdCompressor().compress(b'foo' * 64) + + with self.assertRaisesRegexp(ValueError, 'chunk 1 missing content size in frame'): + dctx.decompress_content_dict_chain([initial, no_size]) + + # Corrupt second frame. 
+ cctx = zstd.ZstdCompressor(write_content_size=True, dict_data=zstd.ZstdCompressionDict(b'foo' * 64)) + frame = cctx.compress(b'bar' * 64) + frame = frame[0:12] + frame[15:] + + with self.assertRaisesRegexp(zstd.ZstdError, 'could not decompress chunk 1'): + dctx.decompress_content_dict_chain([initial, frame]) + + def test_simple(self): + original = [ + b'foo' * 64, + b'foobar' * 64, + b'baz' * 64, + b'foobaz' * 64, + b'foobarbaz' * 64, + ] + + chunks = [] + chunks.append(zstd.ZstdCompressor(write_content_size=True).compress(original[0])) + for i, chunk in enumerate(original[1:]): + d = zstd.ZstdCompressionDict(original[i]) + cctx = zstd.ZstdCompressor(dict_data=d, write_content_size=True) + chunks.append(cctx.compress(chunk)) + + for i in range(1, len(original)): + chain = chunks[0:i] + expected = original[i - 1] + dctx = zstd.ZstdDecompressor() + decompressed = dctx.decompress_content_dict_chain(chain) + self.assertEqual(decompressed, expected) diff -r ed5b25874d99 -r 4baf79a77afa contrib/python-zstandard/tests/test_estimate_sizes.py --- a/contrib/python-zstandard/tests/test_estimate_sizes.py Thu Mar 23 19:54:59 2017 -0700 +++ b/contrib/python-zstandard/tests/test_estimate_sizes.py Fri Mar 24 08:37:26 2017 -0700 @@ -5,7 +5,12 @@ import zstd +from . common import ( + make_cffi, +) + +@make_cffi class TestSizes(unittest.TestCase): def test_decompression_size(self): size = zstd.estimate_decompression_context_size() diff -r ed5b25874d99 -r 4baf79a77afa contrib/python-zstandard/tests/test_module_attributes.py --- a/contrib/python-zstandard/tests/test_module_attributes.py Thu Mar 23 19:54:59 2017 -0700 +++ b/contrib/python-zstandard/tests/test_module_attributes.py Fri Mar 24 08:37:26 2017 -0700 @@ -7,9 +7,15 @@ import zstd +from . 
common import ( + make_cffi, +) + + +@make_cffi class TestModuleAttributes(unittest.TestCase): def test_version(self): - self.assertEqual(zstd.ZSTD_VERSION, (1, 1, 2)) + self.assertEqual(zstd.ZSTD_VERSION, (1, 1, 3)) def test_constants(self): self.assertEqual(zstd.MAX_COMPRESSION_LEVEL, 22) @@ -45,4 +51,4 @@ ) for a in attrs: - self.assertTrue(hasattr(zstd, a)) + self.assertTrue(hasattr(zstd, a), a) diff -r ed5b25874d99 -r 4baf79a77afa contrib/python-zstandard/tests/test_roundtrip.py --- a/contrib/python-zstandard/tests/test_roundtrip.py Thu Mar 23 19:54:59 2017 -0700 +++ b/contrib/python-zstandard/tests/test_roundtrip.py Fri Mar 24 08:37:26 2017 -0700 @@ -13,10 +13,14 @@ import zstd +from .common import ( + make_cffi, +) compression_levels = strategies.integers(min_value=1, max_value=22) +@make_cffi class TestRoundTrip(unittest.TestCase): @hypothesis.given(strategies.binary(), compression_levels) def test_compress_write_to(self, data, level): diff -r ed5b25874d99 -r 4baf79a77afa contrib/python-zstandard/tests/test_train_dictionary.py --- a/contrib/python-zstandard/tests/test_train_dictionary.py Thu Mar 23 19:54:59 2017 -0700 +++ b/contrib/python-zstandard/tests/test_train_dictionary.py Fri Mar 24 08:37:26 2017 -0700 @@ -7,6 +7,9 @@ import zstd +from . 
common import ( + make_cffi, +) if sys.version_info[0] >= 3: int_type = int @@ -14,6 +17,7 @@ int_type = long +@make_cffi class TestTrainDictionary(unittest.TestCase): def test_no_args(self): with self.assertRaises(TypeError): diff -r ed5b25874d99 -r 4baf79a77afa contrib/python-zstandard/zstd.c --- a/contrib/python-zstandard/zstd.c Thu Mar 23 19:54:59 2017 -0700 +++ b/contrib/python-zstandard/zstd.c Fri Mar 24 08:37:26 2017 -0700 @@ -34,6 +34,11 @@ "Obtains a ``CompressionParameters`` instance from a compression level and\n" "optional input size and dictionary size"); +PyDoc_STRVAR(get_frame_parameters__doc__, +"get_frame_parameters(data)\n" +"\n" +"Obtains a ``FrameParameters`` instance by parsing data.\n"); + PyDoc_STRVAR(train_dictionary__doc__, "train_dictionary(dict_size, samples)\n" "\n" @@ -53,6 +58,8 @@ METH_NOARGS, estimate_decompression_context_size__doc__ }, { "get_compression_parameters", (PyCFunction)get_compression_parameters, METH_VARARGS, get_compression_parameters__doc__ }, + { "get_frame_parameters", (PyCFunction)get_frame_parameters, + METH_VARARGS, get_frame_parameters__doc__ }, { "train_dictionary", (PyCFunction)train_dictionary, METH_VARARGS | METH_KEYWORDS, train_dictionary__doc__ }, { NULL, NULL } @@ -70,6 +77,7 @@ void decompressobj_module_init(PyObject* mod); void decompressionwriter_module_init(PyObject* mod); void decompressoriterator_module_init(PyObject* mod); +void frameparams_module_init(PyObject* mod); void zstd_module_init(PyObject* m) { /* python-zstandard relies on unstable zstd C API features. This means @@ -87,7 +95,7 @@ We detect this mismatch here and refuse to load the module if this scenario is detected. 
*/ - if (ZSTD_VERSION_NUMBER != 10102 || ZSTD_versionNumber() != 10102) { + if (ZSTD_VERSION_NUMBER != 10103 || ZSTD_versionNumber() != 10103) { PyErr_SetString(PyExc_ImportError, "zstd C API mismatch; Python bindings not compiled against expected zstd version"); return; } @@ -104,6 +112,7 @@ decompressobj_module_init(m); decompressionwriter_module_init(m); decompressoriterator_module_init(m); + frameparams_module_init(m); } #if PY_MAJOR_VERSION >= 3 diff -r ed5b25874d99 -r 4baf79a77afa contrib/python-zstandard/zstd/common/mem.h --- a/contrib/python-zstandard/zstd/common/mem.h Thu Mar 23 19:54:59 2017 -0700 +++ b/contrib/python-zstandard/zstd/common/mem.h Fri Mar 24 08:37:26 2017 -0700 @@ -39,7 +39,7 @@ #endif /* code only tested on 32 and 64 bits systems */ -#define MEM_STATIC_ASSERT(c) { enum { XXH_static_assert = 1/(int)(!!(c)) }; } +#define MEM_STATIC_ASSERT(c) { enum { MEM_static_assert = 1/(int)(!!(c)) }; } MEM_STATIC void MEM_check(void) { MEM_STATIC_ASSERT((sizeof(size_t)==4) || (sizeof(size_t)==8)); } diff -r ed5b25874d99 -r 4baf79a77afa contrib/python-zstandard/zstd/common/pool.c --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/contrib/python-zstandard/zstd/common/pool.c Fri Mar 24 08:37:26 2017 -0700 @@ -0,0 +1,194 @@ +/** + * Copyright (c) 2016-present, Facebook, Inc. + * All rights reserved. + * + * This source code is licensed under the BSD-style license found in the + * LICENSE file in the root directory of this source tree. An additional grant + * of patent rights can be found in the PATENTS file in the same directory. 
+ */ + + +/* ====== Dependencies ======= */ +#include <stddef.h> /* size_t */ +#include <stdlib.h> /* malloc, calloc, free */ +#include "pool.h" + +/* ====== Compiler specifics ====== */ +#if defined(_MSC_VER) +# pragma warning(disable : 4204) /* disable: C4204: non-constant aggregate initializer */ +#endif + + +#ifdef ZSTD_MULTITHREAD + +#include "threading.h" /* pthread adaptation */ + +/* A job is a function and an opaque argument */ +typedef struct POOL_job_s { + POOL_function function; + void *opaque; +} POOL_job; + +struct POOL_ctx_s { + /* Keep track of the threads */ + pthread_t *threads; + size_t numThreads; + + /* The queue is a circular buffer */ + POOL_job *queue; + size_t queueHead; + size_t queueTail; + size_t queueSize; + /* The mutex protects the queue */ + pthread_mutex_t queueMutex; + /* Condition variable for pushers to wait on when the queue is full */ + pthread_cond_t queuePushCond; + /* Condition variables for poppers to wait on when the queue is empty */ + pthread_cond_t queuePopCond; + /* Indicates if the queue is shutting down */ + int shutdown; +}; + +/* POOL_thread() : + Work thread for the thread pool. + Waits for jobs and executes them. + @returns : NULL on failure else non-null. 
+*/ +static void* POOL_thread(void* opaque) { + POOL_ctx* const ctx = (POOL_ctx*)opaque; + if (!ctx) { return NULL; } + for (;;) { + /* Lock the mutex and wait for a non-empty queue or until shutdown */ + pthread_mutex_lock(&ctx->queueMutex); + while (ctx->queueHead == ctx->queueTail && !ctx->shutdown) { + pthread_cond_wait(&ctx->queuePopCond, &ctx->queueMutex); + } + /* empty => shutting down: so stop */ + if (ctx->queueHead == ctx->queueTail) { + pthread_mutex_unlock(&ctx->queueMutex); + return opaque; + } + /* Pop a job off the queue */ + { POOL_job const job = ctx->queue[ctx->queueHead]; + ctx->queueHead = (ctx->queueHead + 1) % ctx->queueSize; + /* Unlock the mutex, signal a pusher, and run the job */ + pthread_mutex_unlock(&ctx->queueMutex); + pthread_cond_signal(&ctx->queuePushCond); + job.function(job.opaque); + } + } + /* Unreachable */ +} + +POOL_ctx *POOL_create(size_t numThreads, size_t queueSize) { + POOL_ctx *ctx; + /* Check the parameters */ + if (!numThreads || !queueSize) { return NULL; } + /* Allocate the context and zero initialize */ + ctx = (POOL_ctx *)calloc(1, sizeof(POOL_ctx)); + if (!ctx) { return NULL; } + /* Initialize the job queue. + * It needs one extra space since one space is wasted to differentiate empty + * and full queues. 
+ */ + ctx->queueSize = queueSize + 1; + ctx->queue = (POOL_job *)malloc(ctx->queueSize * sizeof(POOL_job)); + ctx->queueHead = 0; + ctx->queueTail = 0; + pthread_mutex_init(&ctx->queueMutex, NULL); + pthread_cond_init(&ctx->queuePushCond, NULL); + pthread_cond_init(&ctx->queuePopCond, NULL); + ctx->shutdown = 0; + /* Allocate space for the thread handles */ + ctx->threads = (pthread_t *)malloc(numThreads * sizeof(pthread_t)); + ctx->numThreads = 0; + /* Check for errors */ + if (!ctx->threads || !ctx->queue) { POOL_free(ctx); return NULL; } + /* Initialize the threads */ + { size_t i; + for (i = 0; i < numThreads; ++i) { + if (pthread_create(&ctx->threads[i], NULL, &POOL_thread, ctx)) { + ctx->numThreads = i; + POOL_free(ctx); + return NULL; + } } + ctx->numThreads = numThreads; + } + return ctx; +} + +/*! POOL_join() : + Shutdown the queue, wake any sleeping threads, and join all of the threads. +*/ +static void POOL_join(POOL_ctx *ctx) { + /* Shut down the queue */ + pthread_mutex_lock(&ctx->queueMutex); + ctx->shutdown = 1; + pthread_mutex_unlock(&ctx->queueMutex); + /* Wake up sleeping threads */ + pthread_cond_broadcast(&ctx->queuePushCond); + pthread_cond_broadcast(&ctx->queuePopCond); + /* Join all of the threads */ + { size_t i; + for (i = 0; i < ctx->numThreads; ++i) { + pthread_join(ctx->threads[i], NULL); + } } +} + +void POOL_free(POOL_ctx *ctx) { + if (!ctx) { return; } + POOL_join(ctx); + pthread_mutex_destroy(&ctx->queueMutex); + pthread_cond_destroy(&ctx->queuePushCond); + pthread_cond_destroy(&ctx->queuePopCond); + if (ctx->queue) free(ctx->queue); + if (ctx->threads) free(ctx->threads); + free(ctx); +} + +void POOL_add(void *ctxVoid, POOL_function function, void *opaque) { + POOL_ctx *ctx = (POOL_ctx *)ctxVoid; + if (!ctx) { return; } + + pthread_mutex_lock(&ctx->queueMutex); + { POOL_job const job = {function, opaque}; + /* Wait until there is space in the queue for the new job */ + size_t newTail = (ctx->queueTail + 1) % ctx->queueSize; + while 
(ctx->queueHead == newTail && !ctx->shutdown) { + pthread_cond_wait(&ctx->queuePushCond, &ctx->queueMutex); + newTail = (ctx->queueTail + 1) % ctx->queueSize; + } + /* The queue is still going => there is space */ + if (!ctx->shutdown) { + ctx->queue[ctx->queueTail] = job; + ctx->queueTail = newTail; + } + } + pthread_mutex_unlock(&ctx->queueMutex); + pthread_cond_signal(&ctx->queuePopCond); +} + +#else /* ZSTD_MULTITHREAD not defined */ +/* No multi-threading support */ + +/* We don't need any data, but if it is empty malloc() might return NULL. */ +struct POOL_ctx_s { + int data; +}; + +POOL_ctx *POOL_create(size_t numThreads, size_t queueSize) { + (void)numThreads; + (void)queueSize; + return (POOL_ctx *)malloc(sizeof(POOL_ctx)); +} + +void POOL_free(POOL_ctx *ctx) { + if (ctx) free(ctx); +} + +void POOL_add(void *ctx, POOL_function function, void *opaque) { + (void)ctx; + function(opaque); +} + +#endif /* ZSTD_MULTITHREAD */ diff -r ed5b25874d99 -r 4baf79a77afa contrib/python-zstandard/zstd/common/pool.h --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/contrib/python-zstandard/zstd/common/pool.h Fri Mar 24 08:37:26 2017 -0700 @@ -0,0 +1,56 @@ +/** + * Copyright (c) 2016-present, Facebook, Inc. + * All rights reserved. + * + * This source code is licensed under the BSD-style license found in the + * LICENSE file in the root directory of this source tree. An additional grant + * of patent rights can be found in the PATENTS file in the same directory. + */ +#ifndef POOL_H +#define POOL_H + +#if defined (__cplusplus) +extern "C" { +#endif + + +#include <stddef.h> /* size_t */ + +typedef struct POOL_ctx_s POOL_ctx; + +/*! POOL_create() : + Create a thread pool with at most `numThreads` threads. + `numThreads` must be at least 1. + The maximum number of queued jobs before blocking is `queueSize`. + `queueSize` must be at least 1. + @return : The POOL_ctx pointer on success else NULL. +*/ +POOL_ctx *POOL_create(size_t numThreads, size_t queueSize); + +/*! 
POOL_free() : + Free a thread pool returned by POOL_create(). +*/ +void POOL_free(POOL_ctx *ctx); + +/*! POOL_function : + The function type that can be added to a thread pool. +*/ +typedef void (*POOL_function)(void *); +/*! POOL_add_function : + The function type for a generic thread pool add function. +*/ +typedef void (*POOL_add_function)(void *, POOL_function, void *); + +/*! POOL_add() : + Add the job `function(opaque)` to the thread pool. + Possibly blocks until there is room in the queue. + Note : The function may be executed asynchronously, so `opaque` must live until the function has been completed. +*/ +void POOL_add(void *ctx, POOL_function function, void *opaque); + + +#if defined (__cplusplus) +} +#endif + +#endif diff -r ed5b25874d99 -r 4baf79a77afa contrib/python-zstandard/zstd/common/threading.c --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/contrib/python-zstandard/zstd/common/threading.c Fri Mar 24 08:37:26 2017 -0700 @@ -0,0 +1,79 @@ + +/** + * Copyright (c) 2016 Tino Reichardt + * All rights reserved. + * + * This source code is licensed under the BSD-style license found in the + * LICENSE file in the root directory of this source tree. An additional grant + * of patent rights can be found in the PATENTS file in the same directory. 
+ * + * You can contact the author at: + * - zstdmt source repository: https://github.com/mcmilk/zstdmt + */ + +/** + * This file will hold wrapper for systems, which do not support pthreads + */ + +/* ====== Compiler specifics ====== */ +#if defined(_MSC_VER) +# pragma warning(disable : 4206) /* disable: C4206: translation unit is empty (when ZSTD_MULTITHREAD is not defined) */ +#endif + + +#if defined(ZSTD_MULTITHREAD) && defined(_WIN32) + +/** + * Windows minimalist Pthread Wrapper, based on : + * http://www.cse.wustl.edu/~schmidt/win32-cv-1.html + */ + + +/* === Dependencies === */ +#include <process.h> +#include <errno.h> +#include "threading.h" + + +/* === Implementation === */ + +static unsigned __stdcall worker(void *arg) +{ + pthread_t* const thread = (pthread_t*) arg; + thread->arg = thread->start_routine(thread->arg); + return 0; +} + +int pthread_create(pthread_t* thread, const void* unused, + void* (*start_routine) (void*), void* arg) +{ + (void)unused; + thread->arg = arg; + thread->start_routine = start_routine; + thread->handle = (HANDLE) _beginthreadex(NULL, 0, worker, thread, 0, NULL); + + if (!thread->handle) + return errno; + else + return 0; +} + +int _pthread_join(pthread_t * thread, void **value_ptr) +{ + DWORD result; + + if (!thread->handle) return 0; + + result = WaitForSingleObject(thread->handle, INFINITE); + switch (result) { + case WAIT_OBJECT_0: + if (value_ptr) *value_ptr = thread->arg; + return 0; + case WAIT_ABANDONED: + return EINVAL; + default: + return GetLastError(); + } +} + +#endif /* ZSTD_MULTITHREAD */ diff -r ed5b25874d99 -r 4baf79a77afa contrib/python-zstandard/zstd/common/threading.h --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/contrib/python-zstandard/zstd/common/threading.h Fri Mar 24 08:37:26 2017 -0700 @@ -0,0 +1,104 @@ + +/** + * Copyright (c) 2016 Tino Reichardt + * All rights reserved. + * + * This source code is licensed under the BSD-style license found in the + * LICENSE file in the root directory of this source tree. 
An additional grant + * of patent rights can be found in the PATENTS file in the same directory. + * + * You can contact the author at: + * - zstdmt source repository: https://github.com/mcmilk/zstdmt + */ + +#ifndef THREADING_H_938743 +#define THREADING_H_938743 + +#if defined (__cplusplus) +extern "C" { +#endif + +#if defined(ZSTD_MULTITHREAD) && defined(_WIN32) + +/** + * Windows minimalist Pthread Wrapper, based on : + * http://www.cse.wustl.edu/~schmidt/win32-cv-1.html + */ +#ifdef WINVER +# undef WINVER +#endif +#define WINVER 0x0600 + +#ifdef _WIN32_WINNT +# undef _WIN32_WINNT +#endif +#define _WIN32_WINNT 0x0600 + +#ifndef WIN32_LEAN_AND_MEAN +# define WIN32_LEAN_AND_MEAN +#endif + +#include <windows.h> + +/* mutex */ +#define pthread_mutex_t CRITICAL_SECTION +#define pthread_mutex_init(a,b) InitializeCriticalSection((a)) +#define pthread_mutex_destroy(a) DeleteCriticalSection((a)) +#define pthread_mutex_lock(a) EnterCriticalSection((a)) +#define pthread_mutex_unlock(a) LeaveCriticalSection((a)) + +/* condition variable */ +#define pthread_cond_t CONDITION_VARIABLE +#define pthread_cond_init(a, b) InitializeConditionVariable((a)) +#define pthread_cond_destroy(a) /* No delete */ +#define pthread_cond_wait(a, b) SleepConditionVariableCS((a), (b), INFINITE) +#define pthread_cond_signal(a) WakeConditionVariable((a)) +#define pthread_cond_broadcast(a) WakeAllConditionVariable((a)) + +/* pthread_create() and pthread_join() */ +typedef struct { + HANDLE handle; + void* (*start_routine)(void*); + void* arg; +} pthread_t; + +int pthread_create(pthread_t* thread, const void* unused, + void* (*start_routine) (void*), void* arg); + +#define pthread_join(a, b) _pthread_join(&(a), (b)) +int _pthread_join(pthread_t* thread, void** value_ptr); + +/** + * add here more wrappers as required + */ + + +#elif defined(ZSTD_MULTITHREAD) /* posix assumed ; need a better detection mathod */ +/* === POSIX Systems === */ +# include <pthread.h> + +#else /* ZSTD_MULTITHREAD not defined */ +/* No 
multithreading support */ + +#define pthread_mutex_t int /* #define rather than typedef, as sometimes pthread support is implicit, resulting in duplicated symbols */ +#define pthread_mutex_init(a,b) +#define pthread_mutex_destroy(a) +#define pthread_mutex_lock(a) +#define pthread_mutex_unlock(a) + +#define pthread_cond_t int +#define pthread_cond_init(a,b) +#define pthread_cond_destroy(a) +#define pthread_cond_wait(a,b) +#define pthread_cond_signal(a) +#define pthread_cond_broadcast(a) + +/* do not use pthread_t */ + +#endif /* ZSTD_MULTITHREAD */ + +#if defined (__cplusplus) +} +#endif + +#endif /* THREADING_H_938743 */ diff -r ed5b25874d99 -r 4baf79a77afa contrib/python-zstandard/zstd/common/zstd_common.c --- a/contrib/python-zstandard/zstd/common/zstd_common.c Thu Mar 23 19:54:59 2017 -0700 +++ b/contrib/python-zstandard/zstd/common/zstd_common.c Fri Mar 24 08:37:26 2017 -0700 @@ -43,10 +43,6 @@ * provides error code string from enum */ const char* ZSTD_getErrorString(ZSTD_ErrorCode code) { return ERR_getErrorName(code); } -/* --- ZBUFF Error Management (deprecated) --- */ -unsigned ZBUFF_isError(size_t errorCode) { return ERR_isError(errorCode); } -const char* ZBUFF_getErrorName(size_t errorCode) { return ERR_getErrorName(errorCode); } - /*=************************************************************** * Custom allocator diff -r ed5b25874d99 -r 4baf79a77afa contrib/python-zstandard/zstd/common/zstd_errors.h --- a/contrib/python-zstandard/zstd/common/zstd_errors.h Thu Mar 23 19:54:59 2017 -0700 +++ b/contrib/python-zstandard/zstd/common/zstd_errors.h Fri Mar 24 08:37:26 2017 -0700 @@ -18,6 +18,20 @@ #include <stddef.h> /* size_t */ +/* ===== ZSTDERRORLIB_API : control library symbols visibility ===== */ +#if defined(__GNUC__) && (__GNUC__ >= 4) +# define ZSTDERRORLIB_VISIBILITY __attribute__ ((visibility ("default"))) +#else +# define ZSTDERRORLIB_VISIBILITY +#endif +#if defined(ZSTD_DLL_EXPORT) && (ZSTD_DLL_EXPORT==1) +# define ZSTDERRORLIB_API __declspec(dllexport) 
ZSTDERRORLIB_VISIBILITY +#elif defined(ZSTD_DLL_IMPORT) && (ZSTD_DLL_IMPORT==1) +# define ZSTDERRORLIB_API __declspec(dllimport) ZSTDERRORLIB_VISIBILITY /* It isn't required but allows to generate better code, saving a function pointer load from the IAT and an indirect jump.*/ +#else +# define ZSTDERRORLIB_API ZSTDERRORLIB_VISIBILITY +#endif + /*-**************************************** * error codes list ******************************************/ @@ -49,8 +63,8 @@ /*! ZSTD_getErrorCode() : convert a `size_t` function result into a `ZSTD_ErrorCode` enum type, which can be used to compare directly with enum list published into "error_public.h" */ -ZSTD_ErrorCode ZSTD_getErrorCode(size_t functionResult); -const char* ZSTD_getErrorString(ZSTD_ErrorCode code); +ZSTDERRORLIB_API ZSTD_ErrorCode ZSTD_getErrorCode(size_t functionResult); +ZSTDERRORLIB_API const char* ZSTD_getErrorString(ZSTD_ErrorCode code); #if defined (__cplusplus) diff -r ed5b25874d99 -r 4baf79a77afa contrib/python-zstandard/zstd/common/zstd_internal.h --- a/contrib/python-zstandard/zstd/common/zstd_internal.h Thu Mar 23 19:54:59 2017 -0700 +++ b/contrib/python-zstandard/zstd/common/zstd_internal.h Fri Mar 24 08:37:26 2017 -0700 @@ -267,4 +267,13 @@ } +/* hidden functions */ + +/* ZSTD_invalidateRepCodes() : + * ensures next compression will not use repcodes from previous block. + * Note : only works with regular variant; + * do not use with extDict variant ! 
*/ +void ZSTD_invalidateRepCodes(ZSTD_CCtx* cctx); + + #endif /* ZSTD_CCOMMON_H_MODULE */ diff -r ed5b25874d99 -r 4baf79a77afa contrib/python-zstandard/zstd/compress/zstd_compress.c --- a/contrib/python-zstandard/zstd/compress/zstd_compress.c Thu Mar 23 19:54:59 2017 -0700 +++ b/contrib/python-zstandard/zstd/compress/zstd_compress.c Fri Mar 24 08:37:26 2017 -0700 @@ -51,8 +51,7 @@ /*-************************************* * Context memory management ***************************************/ -struct ZSTD_CCtx_s -{ +struct ZSTD_CCtx_s { const BYTE* nextSrc; /* next block here to continue on current prefix */ const BYTE* base; /* All regular indexes relative to this position */ const BYTE* dictBase; /* extDict indexes relative to this position */ @@ -61,10 +60,11 @@ U32 nextToUpdate; /* index from which to continue dictionary update */ U32 nextToUpdate3; /* index from which to continue dictionary update */ U32 hashLog3; /* dispatch table : larger == faster, more memory */ - U32 loadedDictEnd; + U32 loadedDictEnd; /* index of end of dictionary */ + U32 forceWindow; /* force back-references to respect limit of 1<customMem), &customMem, sizeof(customMem)); + cctx->customMem = customMem; return cctx; } @@ -119,6 +119,15 @@ return sizeof(*cctx) + cctx->workSpaceSize; } +size_t ZSTD_setCCtxParameter(ZSTD_CCtx* cctx, ZSTD_CCtxParameter param, unsigned value) +{ + switch(param) + { + case ZSTD_p_forceWindow : cctx->forceWindow = value>0; cctx->loadedDictEnd = 0; return 0; + default: return ERROR(parameter_unknown); + } +} + const seqStore_t* ZSTD_getSeqStore(const ZSTD_CCtx* ctx) /* hidden interface */ { return &(ctx->seqStore); @@ -318,6 +327,14 @@ } } +/* ZSTD_invalidateRepCodes() : + * ensures next compression will not use repcodes from previous block. + * Note : only works with regular variant; + * do not use with extDict variant ! */ +void ZSTD_invalidateRepCodes(ZSTD_CCtx* cctx) { + int i; + for (i=0; i<ZSTD_REP_NUM; i++) cctx->rep[i] = 0; +} /*!
ZSTD_copyCCtx() : * Duplicate an existing context `srcCCtx` into another one `dstCCtx`. @@ -735,12 +752,19 @@ if ((size_t)(op-ostart) >= maxCSize) return 0; } /* confirm repcodes */ - { int i; for (i=0; i<ZSTD_REP_NUM; i++) zc->rep[i] = zc->savedRep[i]; } + { int i; for (i=0; i<ZSTD_REP_NUM; i++) zc->rep[i] = zc->repToConfirm[i]; } return op - ostart; } +#if 0 /* for debug */ +# define STORESEQ_DEBUG +#include <stdio.h> /* fprintf */ +U32 g_startDebug = 0; +const BYTE* g_start = NULL; +#endif + /*! ZSTD_storeSeq() : Store a sequence (literal length, literals, offset code and match length code) into seqStore_t. `offsetCode` : distance to match, or 0 == repCode. @@ -748,13 +772,14 @@ */ MEM_STATIC void ZSTD_storeSeq(seqStore_t* seqStorePtr, size_t litLength, const void* literals, U32 offsetCode, size_t matchCode) { -#if 0 /* for debug */ - static const BYTE* g_start = NULL; - const U32 pos = (U32)((const BYTE*)literals - g_start); - if (g_start==NULL) g_start = (const BYTE*)literals; - //if ((pos > 1) && (pos < 50000)) - printf("Cpos %6u :%5u literals & match %3u bytes at distance %6u \n", - pos, (U32)litLength, (U32)matchCode+MINMATCH, (U32)offsetCode); +#ifdef STORESEQ_DEBUG + if (g_startDebug) { + const U32 pos = (U32)((const BYTE*)literals - g_start); + if (g_start==NULL) g_start = (const BYTE*)literals; + if ((pos > 1895000) && (pos < 1895300)) + fprintf(stderr, "Cpos %6u :%5u literals & match %3u bytes at distance %6u \n", + pos, (U32)litLength, (U32)matchCode+MINMATCH, (U32)offsetCode); + } #endif /* copy Literals */ ZSTD_wildcopy(seqStorePtr->lit, literals, litLength); @@ -1004,8 +1029,8 @@ } } } /* save reps for next block */ - cctx->savedRep[0] = offset_1 ? offset_1 : offsetSaved; - cctx->savedRep[1] = offset_2 ? offset_2 : offsetSaved; + cctx->repToConfirm[0] = offset_1 ? offset_1 : offsetSaved; + cctx->repToConfirm[1] = offset_2 ?
offset_2 : offsetSaved; /* Last Literals */ { size_t const lastLLSize = iend - anchor; @@ -1119,7 +1144,7 @@ } } } /* save reps for next block */ - ctx->savedRep[0] = offset_1; ctx->savedRep[1] = offset_2; + ctx->repToConfirm[0] = offset_1; ctx->repToConfirm[1] = offset_2; /* Last Literals */ { size_t const lastLLSize = iend - anchor; @@ -1273,8 +1298,8 @@ } } } /* save reps for next block */ - cctx->savedRep[0] = offset_1 ? offset_1 : offsetSaved; - cctx->savedRep[1] = offset_2 ? offset_2 : offsetSaved; + cctx->repToConfirm[0] = offset_1 ? offset_1 : offsetSaved; + cctx->repToConfirm[1] = offset_2 ? offset_2 : offsetSaved; /* Last Literals */ { size_t const lastLLSize = iend - anchor; @@ -1423,7 +1448,7 @@ } } } /* save reps for next block */ - ctx->savedRep[0] = offset_1; ctx->savedRep[1] = offset_2; + ctx->repToConfirm[0] = offset_1; ctx->repToConfirm[1] = offset_2; /* Last Literals */ { size_t const lastLLSize = iend - anchor; @@ -1955,8 +1980,8 @@ } } /* Save reps for next block */ - ctx->savedRep[0] = offset_1 ? offset_1 : savedOffset; - ctx->savedRep[1] = offset_2 ? offset_2 : savedOffset; + ctx->repToConfirm[0] = offset_1 ? offset_1 : savedOffset; + ctx->repToConfirm[1] = offset_2 ? offset_2 : savedOffset; /* Last Literals */ { size_t const lastLLSize = iend - anchor; @@ -2150,7 +2175,7 @@ } } /* Save reps for next block */ - ctx->savedRep[0] = offset_1; ctx->savedRep[1] = offset_2; + ctx->repToConfirm[0] = offset_1; ctx->repToConfirm[1] = offset_2; /* Last Literals */ { size_t const lastLLSize = iend - anchor; @@ -2409,12 +2434,14 @@ cctx->nextSrc = ip + srcSize; - { size_t const cSize = frame ? + if (srcSize) { + size_t const cSize = frame ? 
ZSTD_compress_generic (cctx, dst, dstCapacity, src, srcSize, lastFrameChunk) : ZSTD_compressBlock_internal (cctx, dst, dstCapacity, src, srcSize); if (ZSTD_isError(cSize)) return cSize; return cSize + fhSize; - } + } else + return fhSize; } @@ -2450,7 +2477,7 @@ zc->dictBase = zc->base; zc->base += ip - zc->nextSrc; zc->nextToUpdate = zc->dictLimit; - zc->loadedDictEnd = (U32)(iend - zc->base); + zc->loadedDictEnd = zc->forceWindow ? 0 : (U32)(iend - zc->base); zc->nextSrc = iend; if (srcSize <= HASH_READ_SIZE) return 0; @@ -2557,9 +2584,9 @@ } if (dictPtr+12 > dictEnd) return ERROR(dictionary_corrupted); - cctx->rep[0] = MEM_readLE32(dictPtr+0); if (cctx->rep[0] >= dictSize) return ERROR(dictionary_corrupted); - cctx->rep[1] = MEM_readLE32(dictPtr+4); if (cctx->rep[1] >= dictSize) return ERROR(dictionary_corrupted); - cctx->rep[2] = MEM_readLE32(dictPtr+8); if (cctx->rep[2] >= dictSize) return ERROR(dictionary_corrupted); + cctx->rep[0] = MEM_readLE32(dictPtr+0); if (cctx->rep[0] == 0 || cctx->rep[0] >= dictSize) return ERROR(dictionary_corrupted); + cctx->rep[1] = MEM_readLE32(dictPtr+4); if (cctx->rep[1] == 0 || cctx->rep[1] >= dictSize) return ERROR(dictionary_corrupted); + cctx->rep[2] = MEM_readLE32(dictPtr+8); if (cctx->rep[2] == 0 || cctx->rep[2] >= dictSize) return ERROR(dictionary_corrupted); dictPtr += 12; { U32 offcodeMax = MaxOff; @@ -2594,7 +2621,6 @@ } } - /*! 
ZSTD_compressBegin_internal() : * @return : 0, or an error code */ static size_t ZSTD_compressBegin_internal(ZSTD_CCtx* cctx, @@ -2626,9 +2652,9 @@ } -size_t ZSTD_compressBegin(ZSTD_CCtx* zc, int compressionLevel) +size_t ZSTD_compressBegin(ZSTD_CCtx* cctx, int compressionLevel) { - return ZSTD_compressBegin_usingDict(zc, NULL, 0, compressionLevel); + return ZSTD_compressBegin_usingDict(cctx, NULL, 0, compressionLevel); } @@ -2733,7 +2759,8 @@ /* ===== Dictionary API ===== */ struct ZSTD_CDict_s { - void* dictContent; + void* dictBuffer; + const void* dictContent; size_t dictContentSize; ZSTD_CCtx* refContext; }; /* typedef'd tp ZSTD_CDict within "zstd.h" */ @@ -2741,39 +2768,45 @@ size_t ZSTD_sizeof_CDict(const ZSTD_CDict* cdict) { if (cdict==NULL) return 0; /* support sizeof on NULL */ - return ZSTD_sizeof_CCtx(cdict->refContext) + cdict->dictContentSize; + return ZSTD_sizeof_CCtx(cdict->refContext) + (cdict->dictBuffer ? cdict->dictContentSize : 0) + sizeof(*cdict); } -ZSTD_CDict* ZSTD_createCDict_advanced(const void* dict, size_t dictSize, ZSTD_parameters params, ZSTD_customMem customMem) +ZSTD_CDict* ZSTD_createCDict_advanced(const void* dictBuffer, size_t dictSize, unsigned byReference, + ZSTD_parameters params, ZSTD_customMem customMem) { if (!customMem.customAlloc && !customMem.customFree) customMem = defaultCustomMem; if (!customMem.customAlloc || !customMem.customFree) return NULL; { ZSTD_CDict* const cdict = (ZSTD_CDict*) ZSTD_malloc(sizeof(ZSTD_CDict), customMem); - void* const dictContent = ZSTD_malloc(dictSize, customMem); ZSTD_CCtx* const cctx = ZSTD_createCCtx_advanced(customMem); - if (!dictContent || !cdict || !cctx) { - ZSTD_free(dictContent, customMem); + if (!cdict || !cctx) { ZSTD_free(cdict, customMem); ZSTD_free(cctx, customMem); return NULL; } - if (dictSize) { - memcpy(dictContent, dict, dictSize); + if ((byReference) || (!dictBuffer) || (!dictSize)) { + cdict->dictBuffer = NULL; + cdict->dictContent = dictBuffer; + } else { + void* const 
internalBuffer = ZSTD_malloc(dictSize, customMem); + if (!internalBuffer) { ZSTD_free(cctx, customMem); ZSTD_free(cdict, customMem); return NULL; } + memcpy(internalBuffer, dictBuffer, dictSize); + cdict->dictBuffer = internalBuffer; + cdict->dictContent = internalBuffer; } - { size_t const errorCode = ZSTD_compressBegin_advanced(cctx, dictContent, dictSize, params, 0); + + { size_t const errorCode = ZSTD_compressBegin_advanced(cctx, cdict->dictContent, dictSize, params, 0); if (ZSTD_isError(errorCode)) { - ZSTD_free(dictContent, customMem); + ZSTD_free(cdict->dictBuffer, customMem); + ZSTD_free(cctx, customMem); ZSTD_free(cdict, customMem); - ZSTD_free(cctx, customMem); return NULL; } } - cdict->dictContent = dictContent; + cdict->refContext = cctx; cdict->dictContentSize = dictSize; - cdict->refContext = cctx; return cdict; } } @@ -2783,7 +2816,15 @@ ZSTD_customMem const allocator = { NULL, NULL, NULL }; ZSTD_parameters params = ZSTD_getParams(compressionLevel, 0, dictSize); params.fParams.contentSizeFlag = 1; - return ZSTD_createCDict_advanced(dict, dictSize, params, allocator); + return ZSTD_createCDict_advanced(dict, dictSize, 0, params, allocator); +} + +ZSTD_CDict* ZSTD_createCDict_byReference(const void* dict, size_t dictSize, int compressionLevel) +{ + ZSTD_customMem const allocator = { NULL, NULL, NULL }; + ZSTD_parameters params = ZSTD_getParams(compressionLevel, 0, dictSize); + params.fParams.contentSizeFlag = 1; + return ZSTD_createCDict_advanced(dict, dictSize, 1, params, allocator); } size_t ZSTD_freeCDict(ZSTD_CDict* cdict) @@ -2791,7 +2832,7 @@ if (cdict==NULL) return 0; /* support free on NULL */ { ZSTD_customMem const cMem = cdict->refContext->customMem; ZSTD_freeCCtx(cdict->refContext); - ZSTD_free(cdict->dictContent, cMem); + ZSTD_free(cdict->dictBuffer, cMem); ZSTD_free(cdict, cMem); return 0; } @@ -2801,7 +2842,7 @@ return ZSTD_getParamsFromCCtx(cdict->refContext); } -size_t ZSTD_compressBegin_usingCDict(ZSTD_CCtx* cctx, const ZSTD_CDict* 
cdict, U64 pledgedSrcSize) +size_t ZSTD_compressBegin_usingCDict(ZSTD_CCtx* cctx, const ZSTD_CDict* cdict, unsigned long long pledgedSrcSize) { if (cdict->dictContentSize) CHECK_F(ZSTD_copyCCtx(cctx, cdict->refContext, pledgedSrcSize)) else CHECK_F(ZSTD_compressBegin_advanced(cctx, NULL, 0, cdict->refContext->params, pledgedSrcSize)); @@ -2900,7 +2941,7 @@ size_t ZSTD_resetCStream(ZSTD_CStream* zcs, unsigned long long pledgedSrcSize) { - if (zcs->inBuffSize==0) return ERROR(stage_wrong); /* zcs has not been init at least once */ + if (zcs->inBuffSize==0) return ERROR(stage_wrong); /* zcs has not been init at least once => can't reset */ if (zcs->cdict) CHECK_F(ZSTD_compressBegin_usingCDict(zcs->cctx, zcs->cdict, pledgedSrcSize)) else CHECK_F(ZSTD_compressBegin_advanced(zcs->cctx, NULL, 0, zcs->params, pledgedSrcSize)); @@ -2937,9 +2978,9 @@ if (zcs->outBuff == NULL) return ERROR(memory_allocation); } - if (dict) { + if (dict && dictSize >= 8) { ZSTD_freeCDict(zcs->cdictLocal); - zcs->cdictLocal = ZSTD_createCDict_advanced(dict, dictSize, params, zcs->customMem); + zcs->cdictLocal = ZSTD_createCDict_advanced(dict, dictSize, 0, params, zcs->customMem); if (zcs->cdictLocal == NULL) return ERROR(memory_allocation); zcs->cdict = zcs->cdictLocal; } else zcs->cdict = NULL; @@ -2956,6 +2997,7 @@ ZSTD_parameters const params = ZSTD_getParamsFromCDict(cdict); size_t const initError = ZSTD_initCStream_advanced(zcs, NULL, 0, params, 0); zcs->cdict = cdict; + zcs->cctx->dictID = params.fParams.noDictIDFlag ? 
0 : cdict->refContext->dictID; return initError; } @@ -2967,7 +3009,8 @@ size_t ZSTD_initCStream_srcSize(ZSTD_CStream* zcs, int compressionLevel, unsigned long long pledgedSrcSize) { - ZSTD_parameters const params = ZSTD_getParams(compressionLevel, pledgedSrcSize, 0); + ZSTD_parameters params = ZSTD_getParams(compressionLevel, pledgedSrcSize, 0); + if (pledgedSrcSize) params.fParams.contentSizeFlag = 1; return ZSTD_initCStream_advanced(zcs, NULL, 0, params, pledgedSrcSize); } diff -r ed5b25874d99 -r 4baf79a77afa contrib/python-zstandard/zstd/compress/zstd_opt.h --- a/contrib/python-zstandard/zstd/compress/zstd_opt.h Thu Mar 23 19:54:59 2017 -0700 +++ b/contrib/python-zstandard/zstd/compress/zstd_opt.h Fri Mar 24 08:37:26 2017 -0700 @@ -38,7 +38,7 @@ ssPtr->cachedLiterals = NULL; ssPtr->cachedPrice = ssPtr->cachedLitLength = 0; - ssPtr->staticPrices = 0; + ssPtr->staticPrices = 0; if (ssPtr->litLengthSum == 0) { if (srcSize <= 1024) ssPtr->staticPrices = 1; @@ -56,7 +56,7 @@ for (u=0; u<=MaxLit; u++) { ssPtr->litFreq[u] = 1 + (ssPtr->litFreq[u]>>ZSTD_FREQ_DIV); - ssPtr->litSum += ssPtr->litFreq[u]; + ssPtr->litSum += ssPtr->litFreq[u]; } for (u=0; u<=MaxLL; u++) ssPtr->litLengthFreq[u] = 1; @@ -634,7 +634,7 @@ } } /* for (cur=0; cur < last_pos; ) */ /* Save reps for next block */ - { int i; for (i=0; i<ZSTD_REP_NUM; i++) ctx->savedRep[i] = rep[i]; } + { int i; for (i=0; i<ZSTD_REP_NUM; i++) ctx->repToConfirm[i] = rep[i]; } /* Last Literals */ { size_t const lastLLSize = iend - anchor; @@ -825,7 +825,7 @@ match_num = ZSTD_BtGetAllMatches_selectMLS_extDict(ctx, inr, iend, maxSearches, mls, matches, minMatch); - if (match_num > 0 && matches[match_num-1].len > sufficient_len) { + if (match_num > 0 && (matches[match_num-1].len > sufficient_len || cur + matches[match_num-1].len >= ZSTD_OPT_NUM)) { best_mlen = matches[match_num-1].len; best_off = matches[match_num-1].off; last_pos = cur + 1; @@ -835,7 +835,7 @@ /* set prices using matches at position = cur */ for (u = 0; u < match_num; u++) { mlen = (u>0) ?
matches[u-1].len+1 : best_mlen; - best_mlen = (cur + matches[u].len < ZSTD_OPT_NUM) ? matches[u].len : ZSTD_OPT_NUM - cur; + best_mlen = matches[u].len; while (mlen <= best_mlen) { if (opt[cur].mlen == 1) { @@ -907,7 +907,7 @@ } } /* for (cur=0; cur < last_pos; ) */ /* Save reps for next block */ - { int i; for (i=0; i<ZSTD_REP_NUM; i++) ctx->savedRep[i] = rep[i]; } + { int i; for (i=0; i<ZSTD_REP_NUM; i++) ctx->repToConfirm[i] = rep[i]; } /* Last Literals */ { size_t lastLLSize = iend - anchor; diff -r ed5b25874d99 -r 4baf79a77afa contrib/python-zstandard/zstd/compress/zstdmt_compress.c --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/contrib/python-zstandard/zstd/compress/zstdmt_compress.c Fri Mar 24 08:37:26 2017 -0700 @@ -0,0 +1,740 @@ +/** + * Copyright (c) 2016-present, Yann Collet, Facebook, Inc. + * All rights reserved. + * + * This source code is licensed under the BSD-style license found in the + * LICENSE file in the root directory of this source tree. An additional grant + * of patent rights can be found in the PATENTS file in the same directory. + */ + + +/* ====== Tuning parameters ====== */ +#define ZSTDMT_NBTHREADS_MAX 128 + + +/* ====== Compiler specifics ====== */ +#if defined(_MSC_VER) +# pragma warning(disable : 4204) /* disable: C4204: non-constant aggregate initializer */ +#endif + + +/* ====== Dependencies ====== */ +#include <stdlib.h> /* malloc */ +#include <string.h> /* memcpy */ +#include "pool.h" /* threadpool */ +#include "threading.h" /* mutex */ +#include "zstd_internal.h" /* MIN, ERROR, ZSTD_*, ZSTD_highbit32 */ +#include "zstdmt_compress.h" +#define XXH_STATIC_LINKING_ONLY /* XXH64_state_t */ +#include "xxhash.h" + + +/* ====== Debug ====== */ +#if 0 + +# include <stdio.h> +# include <unistd.h> +# include <sys/times.h> + static unsigned g_debugLevel = 3; +# define DEBUGLOGRAW(l, ...) if (l<=g_debugLevel) { fprintf(stderr, __VA_ARGS__); } +# define DEBUGLOG(l, ...)
if (l<=g_debugLevel) { fprintf(stderr, __FILE__ ": "); fprintf(stderr, __VA_ARGS__); fprintf(stderr, " \n"); } + +# define DEBUG_PRINTHEX(l,p,n) { \ + unsigned debug_u; \ + for (debug_u=0; debug_u<(n); debug_u++) \ + DEBUGLOGRAW(l, "%02X ", ((const unsigned char*)(p))[debug_u]); \ + DEBUGLOGRAW(l, " \n"); \ +} + +static unsigned long long GetCurrentClockTimeMicroseconds() +{ + static clock_t _ticksPerSecond = 0; + if (_ticksPerSecond <= 0) _ticksPerSecond = sysconf(_SC_CLK_TCK); + + struct tms junk; clock_t newTicks = (clock_t) times(&junk); + return ((((unsigned long long)newTicks)*(1000000))/_ticksPerSecond); +} + +#define MUTEX_WAIT_TIME_DLEVEL 5 +#define PTHREAD_MUTEX_LOCK(mutex) \ +if (g_debugLevel>=MUTEX_WAIT_TIME_DLEVEL) { \ + unsigned long long beforeTime = GetCurrentClockTimeMicroseconds(); \ + pthread_mutex_lock(mutex); \ + unsigned long long afterTime = GetCurrentClockTimeMicroseconds(); \ + unsigned long long elapsedTime = (afterTime-beforeTime); \ + if (elapsedTime > 1000) { /* or whatever threshold you like; I'm using 1 millisecond here */ \ + DEBUGLOG(MUTEX_WAIT_TIME_DLEVEL, "Thread took %llu microseconds to acquire mutex %s \n", \ + elapsedTime, #mutex); \ + } \ +} else pthread_mutex_lock(mutex); + +#else + +# define DEBUGLOG(l, ...) 
{} /* disabled */ +# define PTHREAD_MUTEX_LOCK(m) pthread_mutex_lock(m) +# define DEBUG_PRINTHEX(l,p,n) {} + +#endif + + +/* ===== Buffer Pool ===== */ + +typedef struct buffer_s { + void* start; + size_t size; +} buffer_t; + +static const buffer_t g_nullBuffer = { NULL, 0 }; + +typedef struct ZSTDMT_bufferPool_s { + unsigned totalBuffers; + unsigned nbBuffers; + buffer_t bTable[1]; /* variable size */ +} ZSTDMT_bufferPool; + +static ZSTDMT_bufferPool* ZSTDMT_createBufferPool(unsigned nbThreads) +{ + unsigned const maxNbBuffers = 2*nbThreads + 2; + ZSTDMT_bufferPool* const bufPool = (ZSTDMT_bufferPool*)calloc(1, sizeof(ZSTDMT_bufferPool) + (maxNbBuffers-1) * sizeof(buffer_t)); + if (bufPool==NULL) return NULL; + bufPool->totalBuffers = maxNbBuffers; + bufPool->nbBuffers = 0; + return bufPool; +} + +static void ZSTDMT_freeBufferPool(ZSTDMT_bufferPool* bufPool) +{ + unsigned u; + if (!bufPool) return; /* compatibility with free on NULL */ + for (u=0; u<bufPool->totalBuffers; u++) + free(bufPool->bTable[u].start); + free(bufPool); +} + +/* assumption : invocation from main thread only ! */ +static buffer_t ZSTDMT_getBuffer(ZSTDMT_bufferPool* pool, size_t bSize) +{ + if (pool->nbBuffers) { /* try to use an existing buffer */ + buffer_t const buf = pool->bTable[--(pool->nbBuffers)]; + size_t const availBufferSize = buf.size; + if ((availBufferSize >= bSize) & (availBufferSize <= 10*bSize)) /* large enough, but not too much */ + return buf; + free(buf.start); /* size conditions not respected : scratch this buffer and create a new one */ + } + /* create new buffer */ + { buffer_t buffer; + void* const start = malloc(bSize); + if (start==NULL) bSize = 0; + buffer.start = start; /* note : start can be NULL if malloc fails !
*/ + buffer.size = bSize; + return buffer; + } +} + +/* store buffer for later re-use, up to pool capacity */ +static void ZSTDMT_releaseBuffer(ZSTDMT_bufferPool* pool, buffer_t buf) +{ + if (buf.start == NULL) return; /* release on NULL */ + if (pool->nbBuffers < pool->totalBuffers) { + pool->bTable[pool->nbBuffers++] = buf; /* store for later re-use */ + return; + } + /* Reached bufferPool capacity (should not happen) */ + free(buf.start); +} + + +/* ===== CCtx Pool ===== */ + +typedef struct { + unsigned totalCCtx; + unsigned availCCtx; + ZSTD_CCtx* cctx[1]; /* variable size */ +} ZSTDMT_CCtxPool; + +/* assumption : CCtxPool invocation only from main thread */ + +/* note : all CCtx borrowed from the pool should be released back to the pool _before_ freeing the pool */ +static void ZSTDMT_freeCCtxPool(ZSTDMT_CCtxPool* pool) +{ + unsigned u; + for (u=0; u<pool->totalCCtx; u++) + ZSTD_freeCCtx(pool->cctx[u]); /* note : compatible with free on NULL */ + free(pool); +} + +/* ZSTDMT_createCCtxPool() : + * implies nbThreads >= 1 , checked by caller ZSTDMT_createCCtx() */ +static ZSTDMT_CCtxPool* ZSTDMT_createCCtxPool(unsigned nbThreads) +{ + ZSTDMT_CCtxPool* const cctxPool = (ZSTDMT_CCtxPool*) calloc(1, sizeof(ZSTDMT_CCtxPool) + (nbThreads-1)*sizeof(ZSTD_CCtx*)); + if (!cctxPool) return NULL; + cctxPool->totalCCtx = nbThreads; + cctxPool->availCCtx = 1; /* at least one cctx for single-thread mode */ + cctxPool->cctx[0] = ZSTD_createCCtx(); + if (!cctxPool->cctx[0]) { ZSTDMT_freeCCtxPool(cctxPool); return NULL; } + DEBUGLOG(1, "cctxPool created, with %u threads", nbThreads); + return cctxPool; +} + +static ZSTD_CCtx* ZSTDMT_getCCtx(ZSTDMT_CCtxPool* pool) +{ + if (pool->availCCtx) { + pool->availCCtx--; + return pool->cctx[pool->availCCtx]; + } + return ZSTD_createCCtx(); /* note : can be NULL, when creation fails !
*/ +} + +static void ZSTDMT_releaseCCtx(ZSTDMT_CCtxPool* pool, ZSTD_CCtx* cctx) +{ + if (cctx==NULL) return; /* compatibility with release on NULL */ + if (pool->availCCtx < pool->totalCCtx) + pool->cctx[pool->availCCtx++] = cctx; + else + /* pool overflow : should not happen, since totalCCtx==nbThreads */ + ZSTD_freeCCtx(cctx); +} + + +/* ===== Thread worker ===== */ + +typedef struct { + buffer_t buffer; + size_t filled; +} inBuff_t; + +typedef struct { + ZSTD_CCtx* cctx; + buffer_t src; + const void* srcStart; + size_t srcSize; + size_t dictSize; + buffer_t dstBuff; + size_t cSize; + size_t dstFlushed; + unsigned firstChunk; + unsigned lastChunk; + unsigned jobCompleted; + unsigned jobScanned; + pthread_mutex_t* jobCompleted_mutex; + pthread_cond_t* jobCompleted_cond; + ZSTD_parameters params; + ZSTD_CDict* cdict; + unsigned long long fullFrameSize; +} ZSTDMT_jobDescription; + +/* ZSTDMT_compressChunk() : POOL_function type */ +void ZSTDMT_compressChunk(void* jobDescription) +{ + ZSTDMT_jobDescription* const job = (ZSTDMT_jobDescription*)jobDescription; + const void* const src = (const char*)job->srcStart + job->dictSize; + buffer_t const dstBuff = job->dstBuff; + DEBUGLOG(3, "job (first:%u) (last:%u) : dictSize %u, srcSize %u", job->firstChunk, job->lastChunk, (U32)job->dictSize, (U32)job->srcSize); + if (job->cdict) { + size_t const initError = ZSTD_compressBegin_usingCDict(job->cctx, job->cdict, job->fullFrameSize); + if (job->cdict) DEBUGLOG(3, "using CDict "); + if (ZSTD_isError(initError)) { job->cSize = initError; goto _endJob; } + } else { + size_t const initError = ZSTD_compressBegin_advanced(job->cctx, job->srcStart, job->dictSize, job->params, job->fullFrameSize); + if (ZSTD_isError(initError)) { job->cSize = initError; goto _endJob; } + ZSTD_setCCtxParameter(job->cctx, ZSTD_p_forceWindow, 1); + } + if (!job->firstChunk) { /* flush frame header */ + size_t const hSize = ZSTD_compressContinue(job->cctx, dstBuff.start, dstBuff.size, src, 0); + if 
(ZSTD_isError(hSize)) { job->cSize = hSize; goto _endJob; } + ZSTD_invalidateRepCodes(job->cctx); + } + + DEBUGLOG(4, "Compressing : "); + DEBUG_PRINTHEX(4, job->srcStart, 12); + job->cSize = (job->lastChunk) ? /* last chunk signal */ + ZSTD_compressEnd (job->cctx, dstBuff.start, dstBuff.size, src, job->srcSize) : + ZSTD_compressContinue(job->cctx, dstBuff.start, dstBuff.size, src, job->srcSize); + DEBUGLOG(3, "compressed %u bytes into %u bytes (first:%u) (last:%u)", (unsigned)job->srcSize, (unsigned)job->cSize, job->firstChunk, job->lastChunk); + +_endJob: + PTHREAD_MUTEX_LOCK(job->jobCompleted_mutex); + job->jobCompleted = 1; + job->jobScanned = 0; + pthread_cond_signal(job->jobCompleted_cond); + pthread_mutex_unlock(job->jobCompleted_mutex); +} + + +/* ------------------------------------------ */ +/* ===== Multi-threaded compression ===== */ +/* ------------------------------------------ */ + +struct ZSTDMT_CCtx_s { + POOL_ctx* factory; + ZSTDMT_bufferPool* buffPool; + ZSTDMT_CCtxPool* cctxPool; + pthread_mutex_t jobCompleted_mutex; + pthread_cond_t jobCompleted_cond; + size_t targetSectionSize; + size_t marginSize; + size_t inBuffSize; + size_t dictSize; + size_t targetDictSize; + inBuff_t inBuff; + ZSTD_parameters params; + XXH64_state_t xxhState; + unsigned nbThreads; + unsigned jobIDMask; + unsigned doneJobID; + unsigned nextJobID; + unsigned frameEnded; + unsigned allJobsCompleted; + unsigned overlapRLog; + unsigned long long frameContentSize; + size_t sectionSize; + ZSTD_CDict* cdict; + ZSTD_CStream* cstream; + ZSTDMT_jobDescription jobs[1]; /* variable size (must lies at the end) */ +}; + +ZSTDMT_CCtx *ZSTDMT_createCCtx(unsigned nbThreads) +{ + ZSTDMT_CCtx* cctx; + U32 const minNbJobs = nbThreads + 2; + U32 const nbJobsLog2 = ZSTD_highbit32(minNbJobs) + 1; + U32 const nbJobs = 1 << nbJobsLog2; + DEBUGLOG(5, "nbThreads : %u ; minNbJobs : %u ; nbJobsLog2 : %u ; nbJobs : %u \n", + nbThreads, minNbJobs, nbJobsLog2, nbJobs); + if ((nbThreads < 1) | (nbThreads 
> ZSTDMT_NBTHREADS_MAX)) return NULL; + cctx = (ZSTDMT_CCtx*) calloc(1, sizeof(ZSTDMT_CCtx) + nbJobs*sizeof(ZSTDMT_jobDescription)); + if (!cctx) return NULL; + cctx->nbThreads = nbThreads; + cctx->jobIDMask = nbJobs - 1; + cctx->allJobsCompleted = 1; + cctx->sectionSize = 0; + cctx->overlapRLog = 3; + cctx->factory = POOL_create(nbThreads, 1); + cctx->buffPool = ZSTDMT_createBufferPool(nbThreads); + cctx->cctxPool = ZSTDMT_createCCtxPool(nbThreads); + if (!cctx->factory | !cctx->buffPool | !cctx->cctxPool) { /* one object was not created */ + ZSTDMT_freeCCtx(cctx); + return NULL; + } + if (nbThreads==1) { + cctx->cstream = ZSTD_createCStream(); + if (!cctx->cstream) { + ZSTDMT_freeCCtx(cctx); return NULL; + } } + pthread_mutex_init(&cctx->jobCompleted_mutex, NULL); /* Todo : check init function return */ + pthread_cond_init(&cctx->jobCompleted_cond, NULL); + DEBUGLOG(4, "mt_cctx created, for %u threads \n", nbThreads); + return cctx; +} + +/* ZSTDMT_releaseAllJobResources() : + * Ensure all workers are killed first. 
*/ +static void ZSTDMT_releaseAllJobResources(ZSTDMT_CCtx* mtctx) +{ + unsigned jobID; + for (jobID=0; jobID <= mtctx->jobIDMask; jobID++) { + ZSTDMT_releaseBuffer(mtctx->buffPool, mtctx->jobs[jobID].dstBuff); + mtctx->jobs[jobID].dstBuff = g_nullBuffer; + ZSTDMT_releaseBuffer(mtctx->buffPool, mtctx->jobs[jobID].src); + mtctx->jobs[jobID].src = g_nullBuffer; + ZSTDMT_releaseCCtx(mtctx->cctxPool, mtctx->jobs[jobID].cctx); + mtctx->jobs[jobID].cctx = NULL; + } + memset(mtctx->jobs, 0, (mtctx->jobIDMask+1)*sizeof(ZSTDMT_jobDescription)); + ZSTDMT_releaseBuffer(mtctx->buffPool, mtctx->inBuff.buffer); + mtctx->inBuff.buffer = g_nullBuffer; + mtctx->allJobsCompleted = 1; +} + +size_t ZSTDMT_freeCCtx(ZSTDMT_CCtx* mtctx) +{ + if (mtctx==NULL) return 0; /* compatible with free on NULL */ + POOL_free(mtctx->factory); + if (!mtctx->allJobsCompleted) ZSTDMT_releaseAllJobResources(mtctx); /* stop workers first */ + ZSTDMT_freeBufferPool(mtctx->buffPool); /* release job resources into pools first */ + ZSTDMT_freeCCtxPool(mtctx->cctxPool); + ZSTD_freeCDict(mtctx->cdict); + ZSTD_freeCStream(mtctx->cstream); + pthread_mutex_destroy(&mtctx->jobCompleted_mutex); + pthread_cond_destroy(&mtctx->jobCompleted_cond); + free(mtctx); + return 0; +} + +size_t ZSTDMT_setMTCtxParameter(ZSTDMT_CCtx* mtctx, ZSDTMT_parameter parameter, unsigned value) +{ + switch(parameter) + { + case ZSTDMT_p_sectionSize : + mtctx->sectionSize = value; + return 0; + case ZSTDMT_p_overlapSectionLog : + DEBUGLOG(4, "ZSTDMT_p_overlapSectionLog : %u", value); + mtctx->overlapRLog = (value >= 9) ? 
0 : 9 - value; + return 0; + default : + return ERROR(compressionParameter_unsupported); + } +} + + +/* ------------------------------------------ */ +/* ===== Multi-threaded compression ===== */ +/* ------------------------------------------ */ + +size_t ZSTDMT_compressCCtx(ZSTDMT_CCtx* mtctx, + void* dst, size_t dstCapacity, + const void* src, size_t srcSize, + int compressionLevel) +{ + ZSTD_parameters params = ZSTD_getParams(compressionLevel, srcSize, 0); + size_t const chunkTargetSize = (size_t)1 << (params.cParams.windowLog + 2); + unsigned const nbChunksMax = (unsigned)(srcSize / chunkTargetSize) + (srcSize < chunkTargetSize) /* min 1 */; + unsigned nbChunks = MIN(nbChunksMax, mtctx->nbThreads); + size_t const proposedChunkSize = (srcSize + (nbChunks-1)) / nbChunks; + size_t const avgChunkSize = ((proposedChunkSize & 0x1FFFF) < 0xFFFF) ? proposedChunkSize + 0xFFFF : proposedChunkSize; /* avoid too small last block */ + size_t remainingSrcSize = srcSize; + const char* const srcStart = (const char*)src; + size_t frameStartPos = 0; + + DEBUGLOG(3, "windowLog : %2u => chunkTargetSize : %u bytes ", params.cParams.windowLog, (U32)chunkTargetSize); + DEBUGLOG(2, "nbChunks : %2u (chunkSize : %u bytes) ", nbChunks, (U32)avgChunkSize); + params.fParams.contentSizeFlag = 1; + + if (nbChunks==1) { /* fallback to single-thread mode */ + ZSTD_CCtx* const cctx = mtctx->cctxPool->cctx[0]; + return ZSTD_compressCCtx(cctx, dst, dstCapacity, src, srcSize, compressionLevel); + } + + { unsigned u; + for (u=0; ubuffPool, dstBufferCapacity) : dstAsBuffer; + ZSTD_CCtx* const cctx = ZSTDMT_getCCtx(mtctx->cctxPool); + + if ((cctx==NULL) || (dstBuffer.start==NULL)) { + mtctx->jobs[u].cSize = ERROR(memory_allocation); /* job result */ + mtctx->jobs[u].jobCompleted = 1; + nbChunks = u+1; + break; /* let's wait for previous jobs to complete, but don't start new ones */ + } + + mtctx->jobs[u].srcStart = srcStart + frameStartPos; + mtctx->jobs[u].srcSize = chunkSize; + 
mtctx->jobs[u].fullFrameSize = srcSize; + mtctx->jobs[u].params = params; + mtctx->jobs[u].dstBuff = dstBuffer; + mtctx->jobs[u].cctx = cctx; + mtctx->jobs[u].firstChunk = (u==0); + mtctx->jobs[u].lastChunk = (u==nbChunks-1); + mtctx->jobs[u].jobCompleted = 0; + mtctx->jobs[u].jobCompleted_mutex = &mtctx->jobCompleted_mutex; + mtctx->jobs[u].jobCompleted_cond = &mtctx->jobCompleted_cond; + + DEBUGLOG(3, "posting job %u (%u bytes)", u, (U32)chunkSize); + DEBUG_PRINTHEX(3, mtctx->jobs[u].srcStart, 12); + POOL_add(mtctx->factory, ZSTDMT_compressChunk, &mtctx->jobs[u]); + + frameStartPos += chunkSize; + remainingSrcSize -= chunkSize; + } } + /* note : since nbChunks <= nbThreads, all jobs should be running immediately in parallel */ + + { unsigned chunkID; + size_t error = 0, dstPos = 0; + for (chunkID=0; chunkIDjobCompleted_mutex); + while (mtctx->jobs[chunkID].jobCompleted==0) { + DEBUGLOG(4, "waiting for jobCompleted signal from chunk %u", chunkID); + pthread_cond_wait(&mtctx->jobCompleted_cond, &mtctx->jobCompleted_mutex); + } + pthread_mutex_unlock(&mtctx->jobCompleted_mutex); + DEBUGLOG(3, "ready to write chunk %u ", chunkID); + + ZSTDMT_releaseCCtx(mtctx->cctxPool, mtctx->jobs[chunkID].cctx); + mtctx->jobs[chunkID].cctx = NULL; + mtctx->jobs[chunkID].srcStart = NULL; + { size_t const cSize = mtctx->jobs[chunkID].cSize; + if (ZSTD_isError(cSize)) error = cSize; + if ((!error) && (dstPos + cSize > dstCapacity)) error = ERROR(dstSize_tooSmall); + if (chunkID) { /* note : chunk 0 is already written directly into dst */ + if (!error) memcpy((char*)dst + dstPos, mtctx->jobs[chunkID].dstBuff.start, cSize); + ZSTDMT_releaseBuffer(mtctx->buffPool, mtctx->jobs[chunkID].dstBuff); + mtctx->jobs[chunkID].dstBuff = g_nullBuffer; + } + dstPos += cSize ; + } + } + if (!error) DEBUGLOG(3, "compressed size : %u ", (U32)dstPos); + return error ? 
error : dstPos; + } + +} + + +/* ====================================== */ +/* ======= Streaming API ======= */ +/* ====================================== */ + +static void ZSTDMT_waitForAllJobsCompleted(ZSTDMT_CCtx* zcs) { + while (zcs->doneJobID < zcs->nextJobID) { + unsigned const jobID = zcs->doneJobID & zcs->jobIDMask; + PTHREAD_MUTEX_LOCK(&zcs->jobCompleted_mutex); + while (zcs->jobs[jobID].jobCompleted==0) { + DEBUGLOG(4, "waiting for jobCompleted signal from chunk %u", zcs->doneJobID); /* we want to block when waiting for data to flush */ + pthread_cond_wait(&zcs->jobCompleted_cond, &zcs->jobCompleted_mutex); + } + pthread_mutex_unlock(&zcs->jobCompleted_mutex); + zcs->doneJobID++; + } +} + + +static size_t ZSTDMT_initCStream_internal(ZSTDMT_CCtx* zcs, + const void* dict, size_t dictSize, unsigned updateDict, + ZSTD_parameters params, unsigned long long pledgedSrcSize) +{ + ZSTD_customMem const cmem = { NULL, NULL, NULL }; + DEBUGLOG(3, "Started new compression, with windowLog : %u", params.cParams.windowLog); + if (zcs->nbThreads==1) return ZSTD_initCStream_advanced(zcs->cstream, dict, dictSize, params, pledgedSrcSize); + if (zcs->allJobsCompleted == 0) { /* previous job not correctly finished */ + ZSTDMT_waitForAllJobsCompleted(zcs); + ZSTDMT_releaseAllJobResources(zcs); + zcs->allJobsCompleted = 1; + } + zcs->params = params; + if (updateDict) { + ZSTD_freeCDict(zcs->cdict); zcs->cdict = NULL; + if (dict && dictSize) { + zcs->cdict = ZSTD_createCDict_advanced(dict, dictSize, 0, params, cmem); + if (zcs->cdict == NULL) return ERROR(memory_allocation); + } } + zcs->frameContentSize = pledgedSrcSize; + zcs->targetDictSize = (zcs->overlapRLog>=9) ? 0 : (size_t)1 << (zcs->params.cParams.windowLog - zcs->overlapRLog); + DEBUGLOG(4, "overlapRLog : %u ", zcs->overlapRLog); + DEBUGLOG(3, "overlap Size : %u KB", (U32)(zcs->targetDictSize>>10)); + zcs->targetSectionSize = zcs->sectionSize ? 
zcs->sectionSize : (size_t)1 << (zcs->params.cParams.windowLog + 2); + zcs->targetSectionSize = MAX(ZSTDMT_SECTION_SIZE_MIN, zcs->targetSectionSize); + zcs->targetSectionSize = MAX(zcs->targetDictSize, zcs->targetSectionSize); + DEBUGLOG(3, "Section Size : %u KB", (U32)(zcs->targetSectionSize>>10)); + zcs->marginSize = zcs->targetSectionSize >> 2; + zcs->inBuffSize = zcs->targetDictSize + zcs->targetSectionSize + zcs->marginSize; + zcs->inBuff.buffer = ZSTDMT_getBuffer(zcs->buffPool, zcs->inBuffSize); + if (zcs->inBuff.buffer.start == NULL) return ERROR(memory_allocation); + zcs->inBuff.filled = 0; + zcs->dictSize = 0; + zcs->doneJobID = 0; + zcs->nextJobID = 0; + zcs->frameEnded = 0; + zcs->allJobsCompleted = 0; + if (params.fParams.checksumFlag) XXH64_reset(&zcs->xxhState, 0); + return 0; +} + +size_t ZSTDMT_initCStream_advanced(ZSTDMT_CCtx* zcs, + const void* dict, size_t dictSize, + ZSTD_parameters params, unsigned long long pledgedSrcSize) +{ + return ZSTDMT_initCStream_internal(zcs, dict, dictSize, 1, params, pledgedSrcSize); +} + +/* ZSTDMT_resetCStream() : + * pledgedSrcSize is optional and can be zero == unknown */ +size_t ZSTDMT_resetCStream(ZSTDMT_CCtx* zcs, unsigned long long pledgedSrcSize) +{ + if (zcs->nbThreads==1) return ZSTD_resetCStream(zcs->cstream, pledgedSrcSize); + return ZSTDMT_initCStream_internal(zcs, NULL, 0, 0, zcs->params, pledgedSrcSize); +} + +size_t ZSTDMT_initCStream(ZSTDMT_CCtx* zcs, int compressionLevel) { + ZSTD_parameters const params = ZSTD_getParams(compressionLevel, 0, 0); + return ZSTDMT_initCStream_internal(zcs, NULL, 0, 1, params, 0); +} + + +static size_t ZSTDMT_createCompressionJob(ZSTDMT_CCtx* zcs, size_t srcSize, unsigned endFrame) +{ + size_t const dstBufferCapacity = ZSTD_compressBound(srcSize); + buffer_t const dstBuffer = ZSTDMT_getBuffer(zcs->buffPool, dstBufferCapacity); + ZSTD_CCtx* const cctx = ZSTDMT_getCCtx(zcs->cctxPool); + unsigned const jobID = zcs->nextJobID & zcs->jobIDMask; + + if ((cctx==NULL) || 
(dstBuffer.start==NULL)) { + zcs->jobs[jobID].jobCompleted = 1; + zcs->nextJobID++; + ZSTDMT_waitForAllJobsCompleted(zcs); + ZSTDMT_releaseAllJobResources(zcs); + return ERROR(memory_allocation); + } + + DEBUGLOG(4, "preparing job %u to compress %u bytes with %u preload ", zcs->nextJobID, (U32)srcSize, (U32)zcs->dictSize); + zcs->jobs[jobID].src = zcs->inBuff.buffer; + zcs->jobs[jobID].srcStart = zcs->inBuff.buffer.start; + zcs->jobs[jobID].srcSize = srcSize; + zcs->jobs[jobID].dictSize = zcs->dictSize; /* note : zcs->inBuff.filled is presumed >= srcSize + dictSize */ + zcs->jobs[jobID].params = zcs->params; + if (zcs->nextJobID) zcs->jobs[jobID].params.fParams.checksumFlag = 0; /* do not calculate checksum within sections, just keep it in header for first section */ + zcs->jobs[jobID].cdict = zcs->nextJobID==0 ? zcs->cdict : NULL; + zcs->jobs[jobID].fullFrameSize = zcs->frameContentSize; + zcs->jobs[jobID].dstBuff = dstBuffer; + zcs->jobs[jobID].cctx = cctx; + zcs->jobs[jobID].firstChunk = (zcs->nextJobID==0); + zcs->jobs[jobID].lastChunk = endFrame; + zcs->jobs[jobID].jobCompleted = 0; + zcs->jobs[jobID].dstFlushed = 0; + zcs->jobs[jobID].jobCompleted_mutex = &zcs->jobCompleted_mutex; + zcs->jobs[jobID].jobCompleted_cond = &zcs->jobCompleted_cond; + + /* get a new buffer for next input */ + if (!endFrame) { + size_t const newDictSize = MIN(srcSize + zcs->dictSize, zcs->targetDictSize); + zcs->inBuff.buffer = ZSTDMT_getBuffer(zcs->buffPool, zcs->inBuffSize); + if (zcs->inBuff.buffer.start == NULL) { /* not enough memory to allocate next input buffer */ + zcs->jobs[jobID].jobCompleted = 1; + zcs->nextJobID++; + ZSTDMT_waitForAllJobsCompleted(zcs); + ZSTDMT_releaseAllJobResources(zcs); + return ERROR(memory_allocation); + } + DEBUGLOG(5, "inBuff filled to %u", (U32)zcs->inBuff.filled); + zcs->inBuff.filled -= srcSize + zcs->dictSize - newDictSize; + DEBUGLOG(5, "new job : filled to %u, with %u dict and %u src", (U32)zcs->inBuff.filled, (U32)newDictSize, 
(U32)(zcs->inBuff.filled - newDictSize)); + memmove(zcs->inBuff.buffer.start, (const char*)zcs->jobs[jobID].srcStart + zcs->dictSize + srcSize - newDictSize, zcs->inBuff.filled); + DEBUGLOG(5, "new inBuff pre-filled"); + zcs->dictSize = newDictSize; + } else { + zcs->inBuff.buffer = g_nullBuffer; + zcs->inBuff.filled = 0; + zcs->dictSize = 0; + zcs->frameEnded = 1; + if (zcs->nextJobID == 0) + zcs->params.fParams.checksumFlag = 0; /* single chunk : checksum is calculated directly within worker thread */ + } + + DEBUGLOG(3, "posting job %u : %u bytes (end:%u) (note : doneJob = %u=>%u)", zcs->nextJobID, (U32)zcs->jobs[jobID].srcSize, zcs->jobs[jobID].lastChunk, zcs->doneJobID, zcs->doneJobID & zcs->jobIDMask); + POOL_add(zcs->factory, ZSTDMT_compressChunk, &zcs->jobs[jobID]); /* this call is blocking when thread worker pool is exhausted */ + zcs->nextJobID++; + return 0; +} + + +/* ZSTDMT_flushNextJob() : + * output : will be updated with amount of data flushed . + * blockToFlush : if >0, the function will block and wait if there is no data available to flush . + * @return : amount of data remaining within internal buffer, 1 if unknown but > 0, 0 if no more, or an error code */ +static size_t ZSTDMT_flushNextJob(ZSTDMT_CCtx* zcs, ZSTD_outBuffer* output, unsigned blockToFlush) +{ + unsigned const wJobID = zcs->doneJobID & zcs->jobIDMask; + if (zcs->doneJobID == zcs->nextJobID) return 0; /* all flushed ! 
*/ + PTHREAD_MUTEX_LOCK(&zcs->jobCompleted_mutex); + while (zcs->jobs[wJobID].jobCompleted==0) { + DEBUGLOG(5, "waiting for jobCompleted signal from job %u", zcs->doneJobID); + if (!blockToFlush) { pthread_mutex_unlock(&zcs->jobCompleted_mutex); return 0; } /* nothing ready to be flushed => skip */ + pthread_cond_wait(&zcs->jobCompleted_cond, &zcs->jobCompleted_mutex); /* block when nothing available to flush */ + } + pthread_mutex_unlock(&zcs->jobCompleted_mutex); + /* compression job completed : output can be flushed */ + { ZSTDMT_jobDescription job = zcs->jobs[wJobID]; + if (!job.jobScanned) { + if (ZSTD_isError(job.cSize)) { + DEBUGLOG(5, "compression error detected "); + ZSTDMT_waitForAllJobsCompleted(zcs); + ZSTDMT_releaseAllJobResources(zcs); + return job.cSize; + } + ZSTDMT_releaseCCtx(zcs->cctxPool, job.cctx); + zcs->jobs[wJobID].cctx = NULL; + DEBUGLOG(5, "zcs->params.fParams.checksumFlag : %u ", zcs->params.fParams.checksumFlag); + if (zcs->params.fParams.checksumFlag) { + XXH64_update(&zcs->xxhState, (const char*)job.srcStart + job.dictSize, job.srcSize); + if (zcs->frameEnded && (zcs->doneJobID+1 == zcs->nextJobID)) { /* write checksum at end of last section */ + U32 const checksum = (U32)XXH64_digest(&zcs->xxhState); + DEBUGLOG(4, "writing checksum : %08X \n", checksum); + MEM_writeLE32((char*)job.dstBuff.start + job.cSize, checksum); + job.cSize += 4; + zcs->jobs[wJobID].cSize += 4; + } } + ZSTDMT_releaseBuffer(zcs->buffPool, job.src); + zcs->jobs[wJobID].srcStart = NULL; + zcs->jobs[wJobID].src = g_nullBuffer; + zcs->jobs[wJobID].jobScanned = 1; + } + { size_t const toWrite = MIN(job.cSize - job.dstFlushed, output->size - output->pos); + DEBUGLOG(4, "Flushing %u bytes from job %u ", (U32)toWrite, zcs->doneJobID); + memcpy((char*)output->dst + output->pos, (const char*)job.dstBuff.start + job.dstFlushed, toWrite); + output->pos += toWrite; + job.dstFlushed += toWrite; + } + if (job.dstFlushed == job.cSize) { /* output buffer fully flushed => move to 
next one */ + ZSTDMT_releaseBuffer(zcs->buffPool, job.dstBuff); + zcs->jobs[wJobID].dstBuff = g_nullBuffer; + zcs->jobs[wJobID].jobCompleted = 0; + zcs->doneJobID++; + } else { + zcs->jobs[wJobID].dstFlushed = job.dstFlushed; + } + /* return value : how many bytes left in buffer ; fake it to 1 if unknown but >0 */ + if (job.cSize > job.dstFlushed) return (job.cSize - job.dstFlushed); + if (zcs->doneJobID < zcs->nextJobID) return 1; /* still some buffer to flush */ + zcs->allJobsCompleted = zcs->frameEnded; /* frame completed and entirely flushed */ + return 0; /* everything flushed */ +} } + + +size_t ZSTDMT_compressStream(ZSTDMT_CCtx* zcs, ZSTD_outBuffer* output, ZSTD_inBuffer* input) +{ + size_t const newJobThreshold = zcs->dictSize + zcs->targetSectionSize + zcs->marginSize; + if (zcs->frameEnded) return ERROR(stage_wrong); /* current frame being ended. Only flush is allowed. Restart with init */ + if (zcs->nbThreads==1) return ZSTD_compressStream(zcs->cstream, output, input); + + /* fill input buffer */ + { size_t const toLoad = MIN(input->size - input->pos, zcs->inBuffSize - zcs->inBuff.filled); + memcpy((char*)zcs->inBuff.buffer.start + zcs->inBuff.filled, input->src, toLoad); + input->pos += toLoad; + zcs->inBuff.filled += toLoad; + } + + if ( (zcs->inBuff.filled >= newJobThreshold) /* filled enough : let's compress */ + && (zcs->nextJobID <= zcs->doneJobID + zcs->jobIDMask) ) { /* avoid overwriting job round buffer */ + CHECK_F( ZSTDMT_createCompressionJob(zcs, zcs->targetSectionSize, 0) ); + } + + /* check for data to flush */ + CHECK_F( ZSTDMT_flushNextJob(zcs, output, (zcs->inBuff.filled == zcs->inBuffSize)) ); /* block if it wasn't possible to create new job due to saturation */ + + /* recommended next input size : fill current input buffer */ + return zcs->inBuffSize - zcs->inBuff.filled; /* note : could be zero when input buffer is fully filled and no more availability to create new job */ +} + + +static size_t ZSTDMT_flushStream_internal(ZSTDMT_CCtx* 
zcs, ZSTD_outBuffer* output, unsigned endFrame) +{ + size_t const srcSize = zcs->inBuff.filled - zcs->dictSize; + + if (srcSize) DEBUGLOG(4, "flushing : %u bytes left to compress", (U32)srcSize); + if ( ((srcSize > 0) || (endFrame && !zcs->frameEnded)) + && (zcs->nextJobID <= zcs->doneJobID + zcs->jobIDMask) ) { + CHECK_F( ZSTDMT_createCompressionJob(zcs, srcSize, endFrame) ); + } + + /* check if there is any data available to flush */ + DEBUGLOG(5, "zcs->doneJobID : %u ; zcs->nextJobID : %u ", zcs->doneJobID, zcs->nextJobID); + return ZSTDMT_flushNextJob(zcs, output, 1); +} + + +size_t ZSTDMT_flushStream(ZSTDMT_CCtx* zcs, ZSTD_outBuffer* output) +{ + if (zcs->nbThreads==1) return ZSTD_flushStream(zcs->cstream, output); + return ZSTDMT_flushStream_internal(zcs, output, 0); +} + +size_t ZSTDMT_endStream(ZSTDMT_CCtx* zcs, ZSTD_outBuffer* output) +{ + if (zcs->nbThreads==1) return ZSTD_endStream(zcs->cstream, output); + return ZSTDMT_flushStream_internal(zcs, output, 1); +} diff -r ed5b25874d99 -r 4baf79a77afa contrib/python-zstandard/zstd/compress/zstdmt_compress.h --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/contrib/python-zstandard/zstd/compress/zstdmt_compress.h Fri Mar 24 08:37:26 2017 -0700 @@ -0,0 +1,78 @@ +/** + * Copyright (c) 2016-present, Yann Collet, Facebook, Inc. + * All rights reserved. + * + * This source code is licensed under the BSD-style license found in the + * LICENSE file in the root directory of this source tree. An additional grant + * of patent rights can be found in the PATENTS file in the same directory. + */ + + #ifndef ZSTDMT_COMPRESS_H + #define ZSTDMT_COMPRESS_H + + #if defined (__cplusplus) + extern "C" { + #endif + + +/* Note : All prototypes defined in this file shall be considered experimental. 
+ * There is no guarantee of API continuity (yet) on any of these prototypes */ + +/* === Dependencies === */ +#include <stddef.h> /* size_t */ +#define ZSTD_STATIC_LINKING_ONLY /* ZSTD_parameters */ +#include "zstd.h" /* ZSTD_inBuffer, ZSTD_outBuffer, ZSTDLIB_API */ + + +/* === Simple one-pass functions === */ + +typedef struct ZSTDMT_CCtx_s ZSTDMT_CCtx; +ZSTDLIB_API ZSTDMT_CCtx* ZSTDMT_createCCtx(unsigned nbThreads); +ZSTDLIB_API size_t ZSTDMT_freeCCtx(ZSTDMT_CCtx* cctx); + +ZSTDLIB_API size_t ZSTDMT_compressCCtx(ZSTDMT_CCtx* cctx, + void* dst, size_t dstCapacity, + const void* src, size_t srcSize, + int compressionLevel); + + +/* === Streaming functions === */ + +ZSTDLIB_API size_t ZSTDMT_initCStream(ZSTDMT_CCtx* mtctx, int compressionLevel); +ZSTDLIB_API size_t ZSTDMT_resetCStream(ZSTDMT_CCtx* mtctx, unsigned long long pledgedSrcSize); /**< pledgedSrcSize is optional and can be zero == unknown */ + +ZSTDLIB_API size_t ZSTDMT_compressStream(ZSTDMT_CCtx* mtctx, ZSTD_outBuffer* output, ZSTD_inBuffer* input); + +ZSTDLIB_API size_t ZSTDMT_flushStream(ZSTDMT_CCtx* mtctx, ZSTD_outBuffer* output); /**< @return : 0 == all flushed; >0 : still some data to be flushed; or an error code (ZSTD_isError()) */ +ZSTDLIB_API size_t ZSTDMT_endStream(ZSTDMT_CCtx* mtctx, ZSTD_outBuffer* output); /**< @return : 0 == all flushed; >0 : still some data to be flushed; or an error code (ZSTD_isError()) */ + + +/* === Advanced functions and parameters === */ + +#ifndef ZSTDMT_SECTION_SIZE_MIN +# define ZSTDMT_SECTION_SIZE_MIN (1U << 20) /* 1 MB - Minimum size of each compression job */ +#endif + +ZSTDLIB_API size_t ZSTDMT_initCStream_advanced(ZSTDMT_CCtx* mtctx, const void* dict, size_t dictSize, /**< dict can be released after init, a local copy is preserved within zcs */ + ZSTD_parameters params, unsigned long long pledgedSrcSize); /**< pledgedSrcSize is optional and can be zero == unknown */ + +/* ZSDTMT_parameter : + * List of parameters that can be set using ZSTDMT_setMTCtxParameter() */
+typedef enum { + ZSTDMT_p_sectionSize, /* size of input "section". Each section is compressed in parallel. 0 means default, which is dynamically determined within compression functions */ + ZSTDMT_p_overlapSectionLog /* Log of overlapped section; 0 == no overlap, 6(default) == use 1/8th of window, >=9 == use full window */ +} ZSDTMT_parameter; + +/* ZSTDMT_setMTCtxParameter() : + * allow setting individual parameters, one at a time, among a list of enums defined in ZSTDMT_parameter. + * The function must be called typically after ZSTD_createCCtx(). + * Parameters not explicitly reset by ZSTDMT_init*() remain the same in consecutive compression sessions. + * @return : 0, or an error code (which can be tested using ZSTD_isError()) */ +ZSTDLIB_API size_t ZSTDMT_setMTCtxParameter(ZSTDMT_CCtx* mtctx, ZSDTMT_parameter parameter, unsigned value); + + +#if defined (__cplusplus) +} +#endif + +#endif /* ZSTDMT_COMPRESS_H */ diff -r ed5b25874d99 -r 4baf79a77afa contrib/python-zstandard/zstd/decompress/zstd_decompress.c --- a/contrib/python-zstandard/zstd/decompress/zstd_decompress.c Thu Mar 23 19:54:59 2017 -0700 +++ b/contrib/python-zstandard/zstd/decompress/zstd_decompress.c Fri Mar 24 08:37:26 2017 -0700 @@ -1444,7 +1444,7 @@ #if defined(ZSTD_LEGACY_SUPPORT) && (ZSTD_LEGACY_SUPPORT==1) if (ZSTD_isLegacy(src, srcSize)) return ZSTD_decompressLegacy(dst, dstCapacity, src, srcSize, dict, dictSize); #endif - ZSTD_decompressBegin_usingDict(dctx, dict, dictSize); + CHECK_F(ZSTD_decompressBegin_usingDict(dctx, dict, dictSize)); ZSTD_checkContinuity(dctx, dst); return ZSTD_decompressFrame(dctx, dst, dstCapacity, src, srcSize); } @@ -1671,9 +1671,9 @@ } if (dictPtr+12 > dictEnd) return ERROR(dictionary_corrupted); - dctx->rep[0] = MEM_readLE32(dictPtr+0); if (dctx->rep[0] >= dictSize) return ERROR(dictionary_corrupted); - dctx->rep[1] = MEM_readLE32(dictPtr+4); if (dctx->rep[1] >= dictSize) return ERROR(dictionary_corrupted); - dctx->rep[2] = MEM_readLE32(dictPtr+8); if 
(dctx->rep[2] >= dictSize) return ERROR(dictionary_corrupted); + dctx->rep[0] = MEM_readLE32(dictPtr+0); if (dctx->rep[0] == 0 || dctx->rep[0] >= dictSize) return ERROR(dictionary_corrupted); + dctx->rep[1] = MEM_readLE32(dictPtr+4); if (dctx->rep[1] == 0 || dctx->rep[1] >= dictSize) return ERROR(dictionary_corrupted); + dctx->rep[2] = MEM_readLE32(dictPtr+8); if (dctx->rep[2] == 0 || dctx->rep[2] >= dictSize) return ERROR(dictionary_corrupted); dictPtr += 12; dctx->litEntropy = dctx->fseEntropy = 1; @@ -1713,39 +1713,44 @@ /* ====== ZSTD_DDict ====== */ struct ZSTD_DDict_s { - void* dict; + void* dictBuffer; + const void* dictContent; size_t dictSize; ZSTD_DCtx* refContext; }; /* typedef'd to ZSTD_DDict within "zstd.h" */ -ZSTD_DDict* ZSTD_createDDict_advanced(const void* dict, size_t dictSize, ZSTD_customMem customMem) +ZSTD_DDict* ZSTD_createDDict_advanced(const void* dict, size_t dictSize, unsigned byReference, ZSTD_customMem customMem) { if (!customMem.customAlloc && !customMem.customFree) customMem = defaultCustomMem; if (!customMem.customAlloc || !customMem.customFree) return NULL; { ZSTD_DDict* const ddict = (ZSTD_DDict*) ZSTD_malloc(sizeof(ZSTD_DDict), customMem); - void* const dictContent = ZSTD_malloc(dictSize, customMem); ZSTD_DCtx* const dctx = ZSTD_createDCtx_advanced(customMem); - if (!dictContent || !ddict || !dctx) { - ZSTD_free(dictContent, customMem); + if (!ddict || !dctx) { ZSTD_free(ddict, customMem); ZSTD_free(dctx, customMem); return NULL; } - if (dictSize) { - memcpy(dictContent, dict, dictSize); + if ((byReference) || (!dict) || (!dictSize)) { + ddict->dictBuffer = NULL; + ddict->dictContent = dict; + } else { + void* const internalBuffer = ZSTD_malloc(dictSize, customMem); + if (!internalBuffer) { ZSTD_free(dctx, customMem); ZSTD_free(ddict, customMem); return NULL; } + memcpy(internalBuffer, dict, dictSize); + ddict->dictBuffer = internalBuffer; + ddict->dictContent = internalBuffer; } - { size_t const errorCode = 
ZSTD_decompressBegin_usingDict(dctx, dictContent, dictSize); + { size_t const errorCode = ZSTD_decompressBegin_usingDict(dctx, ddict->dictContent, dictSize); if (ZSTD_isError(errorCode)) { - ZSTD_free(dictContent, customMem); + ZSTD_free(ddict->dictBuffer, customMem); ZSTD_free(ddict, customMem); ZSTD_free(dctx, customMem); return NULL; } } - ddict->dict = dictContent; ddict->dictSize = dictSize; ddict->refContext = dctx; return ddict; @@ -1758,15 +1763,27 @@ ZSTD_DDict* ZSTD_createDDict(const void* dict, size_t dictSize) { ZSTD_customMem const allocator = { NULL, NULL, NULL }; - return ZSTD_createDDict_advanced(dict, dictSize, allocator); + return ZSTD_createDDict_advanced(dict, dictSize, 0, allocator); } + +/*! ZSTD_createDDict_byReference() : + * Create a digested dictionary, ready to start decompression operation without startup delay. + * Dictionary content is simply referenced, and therefore stays in dictBuffer. + * It is important that dictBuffer outlives DDict, it must remain read accessible throughout the lifetime of DDict */ +ZSTD_DDict* ZSTD_createDDict_byReference(const void* dictBuffer, size_t dictSize) +{ + ZSTD_customMem const allocator = { NULL, NULL, NULL }; + return ZSTD_createDDict_advanced(dictBuffer, dictSize, 1, allocator); +} + + size_t ZSTD_freeDDict(ZSTD_DDict* ddict) { if (ddict==NULL) return 0; /* support free on NULL */ { ZSTD_customMem const cMem = ddict->refContext->customMem; ZSTD_freeDCtx(ddict->refContext); - ZSTD_free(ddict->dict, cMem); + ZSTD_free(ddict->dictBuffer, cMem); ZSTD_free(ddict, cMem); return 0; } @@ -1775,7 +1792,7 @@ size_t ZSTD_sizeof_DDict(const ZSTD_DDict* ddict) { if (ddict==NULL) return 0; /* support sizeof on NULL */ - return sizeof(*ddict) + sizeof(ddict->refContext) + ddict->dictSize; + return sizeof(*ddict) + ZSTD_sizeof_DCtx(ddict->refContext) + (ddict->dictBuffer ? ddict->dictSize : 0) ; } /*! 
ZSTD_getDictID_fromDict() : @@ -1796,7 +1813,7 @@ unsigned ZSTD_getDictID_fromDDict(const ZSTD_DDict* ddict) { if (ddict==NULL) return 0; - return ZSTD_getDictID_fromDict(ddict->dict, ddict->dictSize); + return ZSTD_getDictID_fromDict(ddict->dictContent, ddict->dictSize); } /*! ZSTD_getDictID_fromFrame() : @@ -1827,7 +1844,7 @@ const ZSTD_DDict* ddict) { #if defined(ZSTD_LEGACY_SUPPORT) && (ZSTD_LEGACY_SUPPORT==1) - if (ZSTD_isLegacy(src, srcSize)) return ZSTD_decompressLegacy(dst, dstCapacity, src, srcSize, ddict->dict, ddict->dictSize); + if (ZSTD_isLegacy(src, srcSize)) return ZSTD_decompressLegacy(dst, dstCapacity, src, srcSize, ddict->dictContent, ddict->dictSize); #endif ZSTD_refDCtx(dctx, ddict->refContext); ZSTD_checkContinuity(dctx, dst); @@ -1919,7 +1936,7 @@ zds->stage = zdss_loadHeader; zds->lhSize = zds->inPos = zds->outStart = zds->outEnd = 0; ZSTD_freeDDict(zds->ddictLocal); - if (dict) { + if (dict && dictSize >= 8) { zds->ddictLocal = ZSTD_createDDict(dict, dictSize); if (zds->ddictLocal == NULL) return ERROR(memory_allocation); } else zds->ddictLocal = NULL; @@ -1956,7 +1973,7 @@ switch(paramType) { default : return ERROR(parameter_unknown); - case ZSTDdsp_maxWindowSize : zds->maxWindowSize = paramValue ? paramValue : (U32)(-1); break; + case DStream_p_maxWindowSize : zds->maxWindowSize = paramValue ? paramValue : (U32)(-1); break; } return 0; } @@ -2007,7 +2024,7 @@ #if defined(ZSTD_LEGACY_SUPPORT) && (ZSTD_LEGACY_SUPPORT>=1) { U32 const legacyVersion = ZSTD_isLegacy(istart, iend-istart); if (legacyVersion) { - const void* const dict = zds->ddict ? zds->ddict->dict : NULL; + const void* const dict = zds->ddict ? zds->ddict->dictContent : NULL; size_t const dictSize = zds->ddict ? 
zds->ddict->dictSize : 0; CHECK_F(ZSTD_initLegacyStream(&zds->legacyContext, zds->previousLegacyVersion, legacyVersion, dict, dictSize)); diff -r ed5b25874d99 -r 4baf79a77afa contrib/python-zstandard/zstd/dictBuilder/cover.c --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/contrib/python-zstandard/zstd/dictBuilder/cover.c Fri Mar 24 08:37:26 2017 -0700 @@ -0,0 +1,1021 @@ +/** + * Copyright (c) 2016-present, Yann Collet, Facebook, Inc. + * All rights reserved. + * + * This source code is licensed under the BSD-style license found in the + * LICENSE file in the root directory of this source tree. An additional grant + * of patent rights can be found in the PATENTS file in the same directory. + */ + +/*-************************************* +* Dependencies +***************************************/ +#include <stdio.h> /* fprintf */ +#include <stdlib.h> /* malloc, free, qsort */ +#include <string.h> /* memset */ +#include <time.h> /* clock */ + +#include "mem.h" /* read */ +#include "pool.h" +#include "threading.h" +#include "zstd_internal.h" /* includes zstd.h */ +#ifndef ZDICT_STATIC_LINKING_ONLY +#define ZDICT_STATIC_LINKING_ONLY +#endif +#include "zdict.h" + +/*-************************************* +* Constants +***************************************/ +#define COVER_MAX_SAMPLES_SIZE (sizeof(size_t) == 8 ? ((U32)-1) : ((U32)1 GB)) + +/*-************************************* +* Console display +***************************************/ +static int g_displayLevel = 2; +#define DISPLAY(...) \ + { \ + fprintf(stderr, __VA_ARGS__); \ + fflush(stderr); \ + } +#define LOCALDISPLAYLEVEL(displayLevel, l, ...) \ + if (displayLevel >= l) { \ + DISPLAY(__VA_ARGS__); \ + } /* 0 : no display; 1: errors; 2: default; 3: details; 4: debug */ +#define DISPLAYLEVEL(l, ...) LOCALDISPLAYLEVEL(g_displayLevel, l, __VA_ARGS__) + +#define LOCALDISPLAYUPDATE(displayLevel, l, ...)
\ + if (displayLevel >= l) { \ + if ((clock() - g_time > refreshRate) || (displayLevel >= 4)) { \ + g_time = clock(); \ + DISPLAY(__VA_ARGS__); \ + if (displayLevel >= 4) \ + fflush(stdout); \ + } \ + } +#define DISPLAYUPDATE(l, ...) LOCALDISPLAYUPDATE(g_displayLevel, l, __VA_ARGS__) +static const clock_t refreshRate = CLOCKS_PER_SEC * 15 / 100; +static clock_t g_time = 0; + +/*-************************************* +* Hash table +*************************************** +* A small specialized hash map for storing activeDmers. +* The map does not resize, so if it becomes full it will loop forever. +* Thus, the map must be large enough to store every value. +* The map implements linear probing and keeps its load less than 0.5. +*/ + +#define MAP_EMPTY_VALUE ((U32)-1) +typedef struct COVER_map_pair_t_s { + U32 key; + U32 value; +} COVER_map_pair_t; + +typedef struct COVER_map_s { + COVER_map_pair_t *data; + U32 sizeLog; + U32 size; + U32 sizeMask; +} COVER_map_t; + +/** + * Clear the map. + */ +static void COVER_map_clear(COVER_map_t *map) { + memset(map->data, MAP_EMPTY_VALUE, map->size * sizeof(COVER_map_pair_t)); +} + +/** + * Initializes a map of the given size. + * Returns 1 on success and 0 on failure. + * The map must be destroyed with COVER_map_destroy(). + * The map is only guaranteed to be large enough to hold size elements. 
+ */ +static int COVER_map_init(COVER_map_t *map, U32 size) { + map->sizeLog = ZSTD_highbit32(size) + 2; + map->size = (U32)1 << map->sizeLog; + map->sizeMask = map->size - 1; + map->data = (COVER_map_pair_t *)malloc(map->size * sizeof(COVER_map_pair_t)); + if (!map->data) { + map->sizeLog = 0; + map->size = 0; + return 0; + } + COVER_map_clear(map); + return 1; +} + +/** + * Internal hash function + */ +static const U32 prime4bytes = 2654435761U; +static U32 COVER_map_hash(COVER_map_t *map, U32 key) { + return (key * prime4bytes) >> (32 - map->sizeLog); +} + +/** + * Helper function that returns the index that a key should be placed into. + */ +static U32 COVER_map_index(COVER_map_t *map, U32 key) { + const U32 hash = COVER_map_hash(map, key); + U32 i; + for (i = hash;; i = (i + 1) & map->sizeMask) { + COVER_map_pair_t *pos = &map->data[i]; + if (pos->value == MAP_EMPTY_VALUE) { + return i; + } + if (pos->key == key) { + return i; + } + } +} + +/** + * Returns the pointer to the value for key. + * If key is not in the map, it is inserted and the value is set to 0. + * The map must not be full. + */ +static U32 *COVER_map_at(COVER_map_t *map, U32 key) { + COVER_map_pair_t *pos = &map->data[COVER_map_index(map, key)]; + if (pos->value == MAP_EMPTY_VALUE) { + pos->key = key; + pos->value = 0; + } + return &pos->value; +} + +/** + * Deletes key from the map if present. 
+ */ +static void COVER_map_remove(COVER_map_t *map, U32 key) { + U32 i = COVER_map_index(map, key); + COVER_map_pair_t *del = &map->data[i]; + U32 shift = 1; + if (del->value == MAP_EMPTY_VALUE) { + return; + } + for (i = (i + 1) & map->sizeMask;; i = (i + 1) & map->sizeMask) { + COVER_map_pair_t *const pos = &map->data[i]; + /* If the position is empty we are done */ + if (pos->value == MAP_EMPTY_VALUE) { + del->value = MAP_EMPTY_VALUE; + return; + } + /* If pos can be moved to del do so */ + if (((i - COVER_map_hash(map, pos->key)) & map->sizeMask) >= shift) { + del->key = pos->key; + del->value = pos->value; + del = pos; + shift = 1; + } else { + ++shift; + } + } +} + +/** + * Destroyes a map that is inited with COVER_map_init(). + */ +static void COVER_map_destroy(COVER_map_t *map) { + if (map->data) { + free(map->data); + } + map->data = NULL; + map->size = 0; +} + +/*-************************************* +* Context +***************************************/ + +typedef struct { + const BYTE *samples; + size_t *offsets; + const size_t *samplesSizes; + size_t nbSamples; + U32 *suffix; + size_t suffixSize; + U32 *freqs; + U32 *dmerAt; + unsigned d; +} COVER_ctx_t; + +/* We need a global context for qsort... */ +static COVER_ctx_t *g_ctx = NULL; + +/*-************************************* +* Helper functions +***************************************/ + +/** + * Returns the sum of the sample sizes. + */ +static size_t COVER_sum(const size_t *samplesSizes, unsigned nbSamples) { + size_t sum = 0; + size_t i; + for (i = 0; i < nbSamples; ++i) { + sum += samplesSizes[i]; + } + return sum; +} + +/** + * Returns -1 if the dmer at lp is less than the dmer at rp. + * Return 0 if the dmers at lp and rp are equal. + * Returns 1 if the dmer at lp is greater than the dmer at rp. 
+ */ +static int COVER_cmp(COVER_ctx_t *ctx, const void *lp, const void *rp) { + const U32 lhs = *(const U32 *)lp; + const U32 rhs = *(const U32 *)rp; + return memcmp(ctx->samples + lhs, ctx->samples + rhs, ctx->d); +} + +/** + * Same as COVER_cmp() except ties are broken by pointer value + * NOTE: g_ctx must be set to call this function. A global is required because + * qsort doesn't take an opaque pointer. + */ +static int COVER_strict_cmp(const void *lp, const void *rp) { + int result = COVER_cmp(g_ctx, lp, rp); + if (result == 0) { + result = lp < rp ? -1 : 1; + } + return result; +} + +/** + * Returns the first pointer in [first, last) whose element does not compare + * less than value. If no such element exists it returns last. + */ +static const size_t *COVER_lower_bound(const size_t *first, const size_t *last, + size_t value) { + size_t count = last - first; + while (count != 0) { + size_t step = count / 2; + const size_t *ptr = first; + ptr += step; + if (*ptr < value) { + first = ++ptr; + count -= step + 1; + } else { + count = step; + } + } + return first; +} + +/** + * Generic groupBy function. + * Groups an array sorted by cmp into groups with equivalent values. + * Calls grp for each group. + */ +static void +COVER_groupBy(const void *data, size_t count, size_t size, COVER_ctx_t *ctx, + int (*cmp)(COVER_ctx_t *, const void *, const void *), + void (*grp)(COVER_ctx_t *, const void *, const void *)) { + const BYTE *ptr = (const BYTE *)data; + size_t num = 0; + while (num < count) { + const BYTE *grpEnd = ptr + size; + ++num; + while (num < count && cmp(ctx, ptr, grpEnd) == 0) { + grpEnd += size; + ++num; + } + grp(ctx, ptr, grpEnd); + ptr = grpEnd; + } +} + +/*-************************************* +* Cover functions +***************************************/ + +/** + * Called on each group of positions with the same dmer. + * Counts the frequency of each dmer and saves it in the suffix array. + * Fills `ctx->dmerAt`. 
+ */ +static void COVER_group(COVER_ctx_t *ctx, const void *group, + const void *groupEnd) { + /* The group consists of all the positions with the same first d bytes. */ + const U32 *grpPtr = (const U32 *)group; + const U32 *grpEnd = (const U32 *)groupEnd; + /* The dmerId is how we will reference this dmer. + * This allows us to map the whole dmer space to a much smaller space, the + * size of the suffix array. + */ + const U32 dmerId = (U32)(grpPtr - ctx->suffix); + /* Count the number of samples this dmer shows up in */ + U32 freq = 0; + /* Details */ + const size_t *curOffsetPtr = ctx->offsets; + const size_t *offsetsEnd = ctx->offsets + ctx->nbSamples; + /* Once *grpPtr >= curSampleEnd this occurrence of the dmer is in a + * different sample than the last. + */ + size_t curSampleEnd = ctx->offsets[0]; + for (; grpPtr != grpEnd; ++grpPtr) { + /* Save the dmerId for this position so we can get back to it. */ + ctx->dmerAt[*grpPtr] = dmerId; + /* Dictionaries only help for the first reference to the dmer. + * After that zstd can reference the match from the previous reference. + * So only count each dmer once for each sample it is in. + */ + if (*grpPtr < curSampleEnd) { + continue; + } + freq += 1; + /* Binary search to find the end of the sample *grpPtr is in. + * In the common case that grpPtr + 1 == grpEnd we can skip the binary + * search because the loop is over. + */ + if (grpPtr + 1 != grpEnd) { + const size_t *sampleEndPtr = + COVER_lower_bound(curOffsetPtr, offsetsEnd, *grpPtr); + curSampleEnd = *sampleEndPtr; + curOffsetPtr = sampleEndPtr + 1; + } + } + /* At this point we are never going to look at this segment of the suffix + * array again. We take advantage of this fact to save memory. + * We store the frequency of the dmer in the first position of the group, + * which is dmerId. + */ + ctx->suffix[dmerId] = freq; +} + +/** + * A segment is a range in the source as well as the score of the segment. 
+ */ +typedef struct { + U32 begin; + U32 end; + double score; +} COVER_segment_t; + +/** + * Selects the best segment in an epoch. + * Segments of are scored according to the function: + * + * Let F(d) be the frequency of dmer d. + * Let S_i be the dmer at position i of segment S which has length k. + * + * Score(S) = F(S_1) + F(S_2) + ... + F(S_{k-d+1}) + * + * Once the dmer d is in the dictionay we set F(d) = 0. + */ +static COVER_segment_t COVER_selectSegment(const COVER_ctx_t *ctx, U32 *freqs, + COVER_map_t *activeDmers, U32 begin, + U32 end, COVER_params_t parameters) { + /* Constants */ + const U32 k = parameters.k; + const U32 d = parameters.d; + const U32 dmersInK = k - d + 1; + /* Try each segment (activeSegment) and save the best (bestSegment) */ + COVER_segment_t bestSegment = {0, 0, 0}; + COVER_segment_t activeSegment; + /* Reset the activeDmers in the segment */ + COVER_map_clear(activeDmers); + /* The activeSegment starts at the beginning of the epoch. */ + activeSegment.begin = begin; + activeSegment.end = begin; + activeSegment.score = 0; + /* Slide the activeSegment through the whole epoch. + * Save the best segment in bestSegment. + */ + while (activeSegment.end < end) { + /* The dmerId for the dmer at the next position */ + U32 newDmer = ctx->dmerAt[activeSegment.end]; + /* The entry in activeDmers for this dmerId */ + U32 *newDmerOcc = COVER_map_at(activeDmers, newDmer); + /* If the dmer isn't already present in the segment add its score. */ + if (*newDmerOcc == 0) { + /* The paper suggest using the L-0.5 norm, but experiments show that it + * doesn't help. 
+ */ + activeSegment.score += freqs[newDmer]; + } + /* Add the dmer to the segment */ + activeSegment.end += 1; + *newDmerOcc += 1; + + /* If the window is now too large, drop the first position */ + if (activeSegment.end - activeSegment.begin == dmersInK + 1) { + U32 delDmer = ctx->dmerAt[activeSegment.begin]; + U32 *delDmerOcc = COVER_map_at(activeDmers, delDmer); + activeSegment.begin += 1; + *delDmerOcc -= 1; + /* If this is the last occurrence of the dmer, subtract its score */ + if (*delDmerOcc == 0) { + COVER_map_remove(activeDmers, delDmer); + activeSegment.score -= freqs[delDmer]; + } + } + + /* If this segment is the best so far save it */ + if (activeSegment.score > bestSegment.score) { + bestSegment = activeSegment; + } + } + { + /* Trim off the zero frequency head and tail from the segment. */ + U32 newBegin = bestSegment.end; + U32 newEnd = bestSegment.begin; + U32 pos; + for (pos = bestSegment.begin; pos != bestSegment.end; ++pos) { + U32 freq = freqs[ctx->dmerAt[pos]]; + if (freq != 0) { + newBegin = MIN(newBegin, pos); + newEnd = pos + 1; + } + } + bestSegment.begin = newBegin; + bestSegment.end = newEnd; + } + { + /* Zero out the frequency of each dmer covered by the chosen segment. */ + U32 pos; + for (pos = bestSegment.begin; pos != bestSegment.end; ++pos) { + freqs[ctx->dmerAt[pos]] = 0; + } + } + return bestSegment; +} + +/** + * Check the validity of the parameters. + * Returns non-zero if the parameters are valid and 0 otherwise. + */ +static int COVER_checkParameters(COVER_params_t parameters) { + /* k and d are required parameters */ + if (parameters.d == 0 || parameters.k == 0) { + return 0; + } + /* d <= k */ + if (parameters.d > parameters.k) { + return 0; + } + return 1; +} + +/** + * Clean up a context initialized with `COVER_ctx_init()`.
+ */ +static void COVER_ctx_destroy(COVER_ctx_t *ctx) { + if (!ctx) { + return; + } + if (ctx->suffix) { + free(ctx->suffix); + ctx->suffix = NULL; + } + if (ctx->freqs) { + free(ctx->freqs); + ctx->freqs = NULL; + } + if (ctx->dmerAt) { + free(ctx->dmerAt); + ctx->dmerAt = NULL; + } + if (ctx->offsets) { + free(ctx->offsets); + ctx->offsets = NULL; + } +} + +/** + * Prepare a context for dictionary building. + * The context is only dependent on the parameter `d` and can be used multiple + * times. + * Returns 1 on success or zero on error. + * The context must be destroyed with `COVER_ctx_destroy()`. + */ +static int COVER_ctx_init(COVER_ctx_t *ctx, const void *samplesBuffer, + const size_t *samplesSizes, unsigned nbSamples, + unsigned d) { + const BYTE *const samples = (const BYTE *)samplesBuffer; + const size_t totalSamplesSize = COVER_sum(samplesSizes, nbSamples); + /* Checks */ + if (totalSamplesSize < d || + totalSamplesSize >= (size_t)COVER_MAX_SAMPLES_SIZE) { + DISPLAYLEVEL(1, "Total samples size is too large, maximum size is %u MB\n", + (COVER_MAX_SAMPLES_SIZE >> 20)); + return 0; + } + /* Zero the context */ + memset(ctx, 0, sizeof(*ctx)); + DISPLAYLEVEL(2, "Training on %u samples of total size %u\n", nbSamples, + (U32)totalSamplesSize); + ctx->samples = samples; + ctx->samplesSizes = samplesSizes; + ctx->nbSamples = nbSamples; + /* Partial suffix array */ + ctx->suffixSize = totalSamplesSize - d + 1; + ctx->suffix = (U32 *)malloc(ctx->suffixSize * sizeof(U32)); + /* Maps index to the dmerID */ + ctx->dmerAt = (U32 *)malloc(ctx->suffixSize * sizeof(U32)); + /* The offsets of each file */ + ctx->offsets = (size_t *)malloc((nbSamples + 1) * sizeof(size_t)); + if (!ctx->suffix || !ctx->dmerAt || !ctx->offsets) { + DISPLAYLEVEL(1, "Failed to allocate scratch buffers\n"); + COVER_ctx_destroy(ctx); + return 0; + } + ctx->freqs = NULL; + ctx->d = d; + + /* Fill offsets from the samplesSizes */ + { + U32 i; + ctx->offsets[0] = 0; + for (i = 1; i <= nbSamples; ++i) {
+ ctx->offsets[i] = ctx->offsets[i - 1] + samplesSizes[i - 1]; + } + } + DISPLAYLEVEL(2, "Constructing partial suffix array\n"); + { + /* suffix is a partial suffix array. + * It only sorts suffixes by their first parameters.d bytes. + * The sort is stable, so each dmer group is sorted by position in input. + */ + U32 i; + for (i = 0; i < ctx->suffixSize; ++i) { + ctx->suffix[i] = i; + } + /* qsort doesn't take an opaque pointer, so pass as a global */ + g_ctx = ctx; + qsort(ctx->suffix, ctx->suffixSize, sizeof(U32), &COVER_strict_cmp); + } + DISPLAYLEVEL(2, "Computing frequencies\n"); + /* For each dmer group (group of positions with the same first d bytes): + * 1. For each position we set dmerAt[position] = dmerID. The dmerID is + * (groupBeginPtr - suffix). This allows us to go from position to + * dmerID so we can look up values in freq. + * 2. We calculate how many samples the dmer occurs in and save it in + * freqs[dmerId]. + */ + COVER_groupBy(ctx->suffix, ctx->suffixSize, sizeof(U32), ctx, &COVER_cmp, + &COVER_group); + ctx->freqs = ctx->suffix; + ctx->suffix = NULL; + return 1; +} + +/** + * Given the prepared context build the dictionary. + */ +static size_t COVER_buildDictionary(const COVER_ctx_t *ctx, U32 *freqs, + COVER_map_t *activeDmers, void *dictBuffer, + size_t dictBufferCapacity, + COVER_params_t parameters) { + BYTE *const dict = (BYTE *)dictBuffer; + size_t tail = dictBufferCapacity; + /* Divide the data up into epochs of equal size. + * We will select at least one segment from each epoch. + */ + const U32 epochs = (U32)(dictBufferCapacity / parameters.k); + const U32 epochSize = (U32)(ctx->suffixSize / epochs); + size_t epoch; + DISPLAYLEVEL(2, "Breaking content into %u epochs of size %u\n", epochs, + epochSize); + /* Loop through the epochs until there are no more segments or the dictionary + * is full. 
+ */ + for (epoch = 0; tail > 0; epoch = (epoch + 1) % epochs) { + const U32 epochBegin = (U32)(epoch * epochSize); + const U32 epochEnd = epochBegin + epochSize; + size_t segmentSize; + /* Select a segment */ + COVER_segment_t segment = COVER_selectSegment( + ctx, freqs, activeDmers, epochBegin, epochEnd, parameters); + /* Trim the segment if necessary and if it is empty then we are done */ + segmentSize = MIN(segment.end - segment.begin + parameters.d - 1, tail); + if (segmentSize == 0) { + break; + } + /* We fill the dictionary from the back to allow the best segments to be + * referenced with the smallest offsets. + */ + tail -= segmentSize; + memcpy(dict + tail, ctx->samples + segment.begin, segmentSize); + DISPLAYUPDATE( + 2, "\r%u%% ", + (U32)(((dictBufferCapacity - tail) * 100) / dictBufferCapacity)); + } + DISPLAYLEVEL(2, "\r%79s\r", ""); + return tail; +} + +/** + * Translate from COVER_params_t to ZDICT_params_t required for finalizing the + * dictionary. + */ +static ZDICT_params_t COVER_translateParams(COVER_params_t parameters) { + ZDICT_params_t zdictParams; + memset(&zdictParams, 0, sizeof(zdictParams)); + zdictParams.notificationLevel = 1; + zdictParams.dictID = parameters.dictID; + zdictParams.compressionLevel = parameters.compressionLevel; + return zdictParams; +} + +/** + * Constructs a dictionary using a heuristic based on the following paper: + * + * Liao, Petri, Moffat, Wirth + * Effective Construction of Relative Lempel-Ziv Dictionaries + * Published in WWW 2016. 
+ */ +ZDICTLIB_API size_t COVER_trainFromBuffer( + void *dictBuffer, size_t dictBufferCapacity, const void *samplesBuffer, + const size_t *samplesSizes, unsigned nbSamples, COVER_params_t parameters) { + BYTE *const dict = (BYTE *)dictBuffer; + COVER_ctx_t ctx; + COVER_map_t activeDmers; + /* Checks */ + if (!COVER_checkParameters(parameters)) { + DISPLAYLEVEL(1, "Cover parameters incorrect\n"); + return ERROR(GENERIC); + } + if (nbSamples == 0) { + DISPLAYLEVEL(1, "Cover must have at least one input file\n"); + return ERROR(GENERIC); + } + if (dictBufferCapacity < ZDICT_DICTSIZE_MIN) { + DISPLAYLEVEL(1, "dictBufferCapacity must be at least %u\n", + ZDICT_DICTSIZE_MIN); + return ERROR(dstSize_tooSmall); + } + /* Initialize global data */ + g_displayLevel = parameters.notificationLevel; + /* Initialize context and activeDmers */ + if (!COVER_ctx_init(&ctx, samplesBuffer, samplesSizes, nbSamples, + parameters.d)) { + return ERROR(GENERIC); + } + if (!COVER_map_init(&activeDmers, parameters.k - parameters.d + 1)) { + DISPLAYLEVEL(1, "Failed to allocate dmer map: out of memory\n"); + COVER_ctx_destroy(&ctx); + return ERROR(GENERIC); + } + + DISPLAYLEVEL(2, "Building dictionary\n"); + { + const size_t tail = + COVER_buildDictionary(&ctx, ctx.freqs, &activeDmers, dictBuffer, + dictBufferCapacity, parameters); + ZDICT_params_t zdictParams = COVER_translateParams(parameters); + const size_t dictionarySize = ZDICT_finalizeDictionary( + dict, dictBufferCapacity, dict + tail, dictBufferCapacity - tail, + samplesBuffer, samplesSizes, nbSamples, zdictParams); + if (!ZSTD_isError(dictionarySize)) { + DISPLAYLEVEL(2, "Constructed dictionary of size %u\n", + (U32)dictionarySize); + } + COVER_ctx_destroy(&ctx); + COVER_map_destroy(&activeDmers); + return dictionarySize; + } +} + +/** + * COVER_best_t is used for two purposes: + * 1. Synchronizing threads. + * 2. Saving the best parameters and dictionary. 
+ * + * All of the methods except COVER_best_init() are thread safe if zstd is + * compiled with multithreaded support. + */ +typedef struct COVER_best_s { + pthread_mutex_t mutex; + pthread_cond_t cond; + size_t liveJobs; + void *dict; + size_t dictSize; + COVER_params_t parameters; + size_t compressedSize; +} COVER_best_t; + +/** + * Initialize the `COVER_best_t`. + */ +static void COVER_best_init(COVER_best_t *best) { + if (!best) { + return; + } + pthread_mutex_init(&best->mutex, NULL); + pthread_cond_init(&best->cond, NULL); + best->liveJobs = 0; + best->dict = NULL; + best->dictSize = 0; + best->compressedSize = (size_t)-1; + memset(&best->parameters, 0, sizeof(best->parameters)); +} + +/** + * Wait until liveJobs == 0. + */ +static void COVER_best_wait(COVER_best_t *best) { + if (!best) { + return; + } + pthread_mutex_lock(&best->mutex); + while (best->liveJobs != 0) { + pthread_cond_wait(&best->cond, &best->mutex); + } + pthread_mutex_unlock(&best->mutex); +} + +/** + * Call COVER_best_wait() and then destroy the COVER_best_t. + */ +static void COVER_best_destroy(COVER_best_t *best) { + if (!best) { + return; + } + COVER_best_wait(best); + if (best->dict) { + free(best->dict); + } + pthread_mutex_destroy(&best->mutex); + pthread_cond_destroy(&best->cond); +} + +/** + * Called when a thread is about to be launched. + * Increments liveJobs. + */ +static void COVER_best_start(COVER_best_t *best) { + if (!best) { + return; + } + pthread_mutex_lock(&best->mutex); + ++best->liveJobs; + pthread_mutex_unlock(&best->mutex); +} + +/** + * Called when a thread finishes executing, both on error or success. + * Decrements liveJobs and signals any waiting threads if liveJobs == 0. + * If this dictionary is the best so far save it and its parameters. 
+ */ +static void COVER_best_finish(COVER_best_t *best, size_t compressedSize, + COVER_params_t parameters, void *dict, + size_t dictSize) { + if (!best) { + return; + } + { + size_t liveJobs; + pthread_mutex_lock(&best->mutex); + --best->liveJobs; + liveJobs = best->liveJobs; + /* If the new dictionary is better */ + if (compressedSize < best->compressedSize) { + /* Allocate space if necessary */ + if (!best->dict || best->dictSize < dictSize) { + if (best->dict) { + free(best->dict); + } + best->dict = malloc(dictSize); + if (!best->dict) { + best->compressedSize = ERROR(GENERIC); + best->dictSize = 0; + return; + } + } + /* Save the dictionary, parameters, and size */ + memcpy(best->dict, dict, dictSize); + best->dictSize = dictSize; + best->parameters = parameters; + best->compressedSize = compressedSize; + } + pthread_mutex_unlock(&best->mutex); + if (liveJobs == 0) { + pthread_cond_broadcast(&best->cond); + } + } +} + +/** + * Parameters for COVER_tryParameters(). + */ +typedef struct COVER_tryParameters_data_s { + const COVER_ctx_t *ctx; + COVER_best_t *best; + size_t dictBufferCapacity; + COVER_params_t parameters; +} COVER_tryParameters_data_t; + +/** + * Tries a set of parameters and updates the COVER_best_t with the results. + * This function is thread safe if zstd is compiled with multithreaded support. + * It takes its parameters as an *OWNING* opaque pointer to support threading.
+ */ +static void COVER_tryParameters(void *opaque) { + /* Save parameters as local variables */ + COVER_tryParameters_data_t *const data = (COVER_tryParameters_data_t *)opaque; + const COVER_ctx_t *const ctx = data->ctx; + const COVER_params_t parameters = data->parameters; + size_t dictBufferCapacity = data->dictBufferCapacity; + size_t totalCompressedSize = ERROR(GENERIC); + /* Allocate space for hash table, dict, and freqs */ + COVER_map_t activeDmers; + BYTE *const dict = (BYTE * const)malloc(dictBufferCapacity); + U32 *freqs = (U32 *)malloc(ctx->suffixSize * sizeof(U32)); + if (!COVER_map_init(&activeDmers, parameters.k - parameters.d + 1)) { + DISPLAYLEVEL(1, "Failed to allocate dmer map: out of memory\n"); + goto _cleanup; + } + if (!dict || !freqs) { + DISPLAYLEVEL(1, "Failed to allocate buffers: out of memory\n"); + goto _cleanup; + } + /* Copy the frequencies because we need to modify them */ + memcpy(freqs, ctx->freqs, ctx->suffixSize * sizeof(U32)); + /* Build the dictionary */ + { + const size_t tail = COVER_buildDictionary(ctx, freqs, &activeDmers, dict, + dictBufferCapacity, parameters); + const ZDICT_params_t zdictParams = COVER_translateParams(parameters); + dictBufferCapacity = ZDICT_finalizeDictionary( + dict, dictBufferCapacity, dict + tail, dictBufferCapacity - tail, + ctx->samples, ctx->samplesSizes, (unsigned)ctx->nbSamples, zdictParams); + if (ZDICT_isError(dictBufferCapacity)) { + DISPLAYLEVEL(1, "Failed to finalize dictionary\n"); + goto _cleanup; + } + } + /* Check total compressed size */ + { + /* Pointers */ + ZSTD_CCtx *cctx; + ZSTD_CDict *cdict; + void *dst; + /* Local variables */ + size_t dstCapacity; + size_t i; + /* Allocate dst with enough space to compress the maximum sized sample */ + { + size_t maxSampleSize = 0; + for (i = 0; i < ctx->nbSamples; ++i) { + maxSampleSize = MAX(ctx->samplesSizes[i], maxSampleSize); + } + dstCapacity = ZSTD_compressBound(maxSampleSize); + dst = malloc(dstCapacity); + } + /* Create the cctx and 
cdict */ + cctx = ZSTD_createCCtx(); + cdict = + ZSTD_createCDict(dict, dictBufferCapacity, parameters.compressionLevel); + if (!dst || !cctx || !cdict) { + goto _compressCleanup; + } + /* Compress each sample and sum their sizes (or error) */ + totalCompressedSize = 0; + for (i = 0; i < ctx->nbSamples; ++i) { + const size_t size = ZSTD_compress_usingCDict( + cctx, dst, dstCapacity, ctx->samples + ctx->offsets[i], + ctx->samplesSizes[i], cdict); + if (ZSTD_isError(size)) { + totalCompressedSize = ERROR(GENERIC); + goto _compressCleanup; + } + totalCompressedSize += size; + } + _compressCleanup: + ZSTD_freeCCtx(cctx); + ZSTD_freeCDict(cdict); + if (dst) { + free(dst); + } + } + +_cleanup: + COVER_best_finish(data->best, totalCompressedSize, parameters, dict, + dictBufferCapacity); + free(data); + COVER_map_destroy(&activeDmers); + if (dict) { + free(dict); + } + if (freqs) { + free(freqs); + } +} + +ZDICTLIB_API size_t COVER_optimizeTrainFromBuffer(void *dictBuffer, + size_t dictBufferCapacity, + const void *samplesBuffer, + const size_t *samplesSizes, + unsigned nbSamples, + COVER_params_t *parameters) { + /* constants */ + const unsigned nbThreads = parameters->nbThreads; + const unsigned kMinD = parameters->d == 0 ? 6 : parameters->d; + const unsigned kMaxD = parameters->d == 0 ? 16 : parameters->d; + const unsigned kMinK = parameters->k == 0 ? kMaxD : parameters->k; + const unsigned kMaxK = parameters->k == 0 ? 2048 : parameters->k; + const unsigned kSteps = parameters->steps == 0 ? 
32 : parameters->steps; + const unsigned kStepSize = MAX((kMaxK - kMinK) / kSteps, 1); + const unsigned kIterations = + (1 + (kMaxD - kMinD) / 2) * (1 + (kMaxK - kMinK) / kStepSize); + /* Local variables */ + const int displayLevel = parameters->notificationLevel; + unsigned iteration = 1; + unsigned d; + unsigned k; + COVER_best_t best; + POOL_ctx *pool = NULL; + /* Checks */ + if (kMinK < kMaxD || kMaxK < kMinK) { + LOCALDISPLAYLEVEL(displayLevel, 1, "Incorrect parameters\n"); + return ERROR(GENERIC); + } + if (nbSamples == 0) { + DISPLAYLEVEL(1, "Cover must have at least one input file\n"); + return ERROR(GENERIC); + } + if (dictBufferCapacity < ZDICT_DICTSIZE_MIN) { + DISPLAYLEVEL(1, "dictBufferCapacity must be at least %u\n", + ZDICT_DICTSIZE_MIN); + return ERROR(dstSize_tooSmall); + } + if (nbThreads > 1) { + pool = POOL_create(nbThreads, 1); + if (!pool) { + return ERROR(memory_allocation); + } + } + /* Initialization */ + COVER_best_init(&best); + /* Turn down global display level to clean up display at level 2 and below */ + g_displayLevel = parameters->notificationLevel - 1; + /* Loop through d first because each new value needs a new context */ + LOCALDISPLAYLEVEL(displayLevel, 2, "Trying %u different sets of parameters\n", + kIterations); + for (d = kMinD; d <= kMaxD; d += 2) { + /* Initialize the context for this value of d */ + COVER_ctx_t ctx; + LOCALDISPLAYLEVEL(displayLevel, 3, "d=%u\n", d); + if (!COVER_ctx_init(&ctx, samplesBuffer, samplesSizes, nbSamples, d)) { + LOCALDISPLAYLEVEL(displayLevel, 1, "Failed to initialize context\n"); + COVER_best_destroy(&best); + return ERROR(GENERIC); + } + /* Loop through k reusing the same context */ + for (k = kMinK; k <= kMaxK; k += kStepSize) { + /* Prepare the arguments */ + COVER_tryParameters_data_t *data = (COVER_tryParameters_data_t *)malloc( + sizeof(COVER_tryParameters_data_t)); + LOCALDISPLAYLEVEL(displayLevel, 3, "k=%u\n", k); + if (!data) { + LOCALDISPLAYLEVEL(displayLevel, 1, "Failed to allocate 
parameters\n"); + COVER_best_destroy(&best); + COVER_ctx_destroy(&ctx); + return ERROR(GENERIC); + } + data->ctx = &ctx; + data->best = &best; + data->dictBufferCapacity = dictBufferCapacity; + data->parameters = *parameters; + data->parameters.k = k; + data->parameters.d = d; + data->parameters.steps = kSteps; + /* Check the parameters */ + if (!COVER_checkParameters(data->parameters)) { + DISPLAYLEVEL(1, "Cover parameters incorrect\n"); + continue; + } + /* Call the function and pass ownership of data to it */ + COVER_best_start(&best); + if (pool) { + POOL_add(pool, &COVER_tryParameters, data); + } else { + COVER_tryParameters(data); + } + /* Print status */ + LOCALDISPLAYUPDATE(displayLevel, 2, "\r%u%% ", + (U32)((iteration * 100) / kIterations)); + ++iteration; + } + COVER_best_wait(&best); + COVER_ctx_destroy(&ctx); + } + LOCALDISPLAYLEVEL(displayLevel, 2, "\r%79s\r", ""); + /* Fill the output buffer and parameters with output of the best parameters */ + { + const size_t dictSize = best.dictSize; + if (ZSTD_isError(best.compressedSize)) { + COVER_best_destroy(&best); + return best.compressedSize; + } + *parameters = best.parameters; + memcpy(dictBuffer, best.dict, dictSize); + COVER_best_destroy(&best); + POOL_free(pool); + return dictSize; + } +} diff -r ed5b25874d99 -r 4baf79a77afa contrib/python-zstandard/zstd/dictBuilder/zdict.c --- a/contrib/python-zstandard/zstd/dictBuilder/zdict.c Thu Mar 23 19:54:59 2017 -0700 +++ b/contrib/python-zstandard/zstd/dictBuilder/zdict.c Fri Mar 24 08:37:26 2017 -0700 @@ -36,12 +36,11 @@ #include /* clock */ #include "mem.h" /* read */ -#include "error_private.h" #include "fse.h" /* FSE_normalizeCount, FSE_writeNCount */ #define HUF_STATIC_LINKING_ONLY -#include "huf.h" +#include "huf.h" /* HUF_buildCTable, HUF_writeCTable */ #include "zstd_internal.h" /* includes zstd.h */ -#include "xxhash.h" +#include "xxhash.h" /* XXH64 */ #include "divsufsort.h" #ifndef ZDICT_STATIC_LINKING_ONLY # define ZDICT_STATIC_LINKING_ONLY @@ 
-61,7 +60,7 @@ #define NOISELENGTH 32 #define MINRATIO 4 -static const int g_compressionLevel_default = 5; +static const int g_compressionLevel_default = 6; static const U32 g_selectivity_default = 9; static const size_t g_provision_entropySize = 200; static const size_t g_min_fast_dictContent = 192; @@ -307,13 +306,13 @@ } while (length >=MINMATCHLENGTH); /* look backward */ - length = MINMATCHLENGTH; - while ((length >= MINMATCHLENGTH) & (start > 0)) { - length = ZDICT_count(b + pos, b + suffix[start - 1]); - if (length >= LLIMIT) length = LLIMIT - 1; - lengthList[length]++; - if (length >= MINMATCHLENGTH) start--; - } + length = MINMATCHLENGTH; + while ((length >= MINMATCHLENGTH) & (start > 0)) { + length = ZDICT_count(b + pos, b + suffix[start - 1]); + if (length >= LLIMIT) length = LLIMIT - 1; + lengthList[length]++; + if (length >= MINMATCHLENGTH) start--; + } /* largest useful length */ memset(cumulLength, 0, sizeof(cumulLength)); @@ -570,7 +569,7 @@ if (ZSTD_isError(errorCode)) { DISPLAYLEVEL(1, "warning : ZSTD_copyCCtx failed \n"); return; } } cSize = ZSTD_compressBlock(esr.zc, esr.workPlace, ZSTD_BLOCKSIZE_ABSOLUTEMAX, src, srcSize); - if (ZSTD_isError(cSize)) { DISPLAYLEVEL(1, "warning : could not compress sample size %u \n", (U32)srcSize); return; } + if (ZSTD_isError(cSize)) { DISPLAYLEVEL(3, "warning : could not compress sample size %u \n", (U32)srcSize); return; } if (cSize) { /* if == 0; block is not compressible */ const seqStore_t* seqStorePtr = ZSTD_getSeqStore(esr.zc); @@ -825,6 +824,55 @@ } + +size_t ZDICT_finalizeDictionary(void* dictBuffer, size_t dictBufferCapacity, + const void* customDictContent, size_t dictContentSize, + const void* samplesBuffer, const size_t* samplesSizes, unsigned nbSamples, + ZDICT_params_t params) +{ + size_t hSize; +#define HBUFFSIZE 256 + BYTE header[HBUFFSIZE]; + int const compressionLevel = (params.compressionLevel <= 0) ? 
g_compressionLevel_default : params.compressionLevel; + U32 const notificationLevel = params.notificationLevel; + + /* check conditions */ + if (dictBufferCapacity < dictContentSize) return ERROR(dstSize_tooSmall); + if (dictContentSize < ZDICT_CONTENTSIZE_MIN) return ERROR(srcSize_wrong); + if (dictBufferCapacity < ZDICT_DICTSIZE_MIN) return ERROR(dstSize_tooSmall); + + /* dictionary header */ + MEM_writeLE32(header, ZSTD_DICT_MAGIC); + { U64 const randomID = XXH64(customDictContent, dictContentSize, 0); + U32 const compliantID = (randomID % ((1U<<31)-32768)) + 32768; + U32 const dictID = params.dictID ? params.dictID : compliantID; + MEM_writeLE32(header+4, dictID); + } + hSize = 8; + + /* entropy tables */ + DISPLAYLEVEL(2, "\r%70s\r", ""); /* clean display line */ + DISPLAYLEVEL(2, "statistics ... \n"); + { size_t const eSize = ZDICT_analyzeEntropy(header+hSize, HBUFFSIZE-hSize, + compressionLevel, + samplesBuffer, samplesSizes, nbSamples, + customDictContent, dictContentSize, + notificationLevel); + if (ZDICT_isError(eSize)) return eSize; + hSize += eSize; + } + + /* copy elements in final buffer ; note : src and dst buffer can overlap */ + if (hSize + dictContentSize > dictBufferCapacity) dictContentSize = dictBufferCapacity - hSize; + { size_t const dictSize = hSize + dictContentSize; + char* dictEnd = (char*)dictBuffer + dictSize; + memmove(dictEnd - dictContentSize, customDictContent, dictContentSize); + memcpy(dictBuffer, header, hSize); + return dictSize; + } +} + + size_t ZDICT_addEntropyTablesFromBuffer_advanced(void* dictBuffer, size_t dictContentSize, size_t dictBufferCapacity, const void* samplesBuffer, const size_t* samplesSizes, unsigned nbSamples, ZDICT_params_t params) diff -r ed5b25874d99 -r 4baf79a77afa contrib/python-zstandard/zstd/dictBuilder/zdict.h --- a/contrib/python-zstandard/zstd/dictBuilder/zdict.h Thu Mar 23 19:54:59 2017 -0700 +++ b/contrib/python-zstandard/zstd/dictBuilder/zdict.h Fri Mar 24 08:37:26 2017 -0700 @@ -19,15 +19,18 @@ 
#include <stddef.h> /* size_t */ -/*====== Export for Windows ======*/ -/*! -* ZSTD_DLL_EXPORT : -* Enable exporting of functions when building a Windows DLL -*/ -#if defined(_WIN32) && defined(ZSTD_DLL_EXPORT) && (ZSTD_DLL_EXPORT==1) -# define ZDICTLIB_API __declspec(dllexport) +/* ===== ZDICTLIB_API : control library symbols visibility ===== */ +#if defined(__GNUC__) && (__GNUC__ >= 4) +# define ZDICTLIB_VISIBILITY __attribute__ ((visibility ("default"))) #else -# define ZDICTLIB_API +# define ZDICTLIB_VISIBILITY +#endif +#if defined(ZSTD_DLL_EXPORT) && (ZSTD_DLL_EXPORT==1) +# define ZDICTLIB_API __declspec(dllexport) ZDICTLIB_VISIBILITY +#elif defined(ZSTD_DLL_IMPORT) && (ZSTD_DLL_IMPORT==1) +# define ZDICTLIB_API __declspec(dllimport) ZDICTLIB_VISIBILITY /* It isn't required but allows to generate better code, saving a function pointer load from the IAT and an indirect jump.*/ +#else +# define ZDICTLIB_API ZDICTLIB_VISIBILITY #endif @@ -79,27 +82,114 @@ or an error code, which can be tested by ZDICT_isError(). note : ZDICT_trainFromBuffer_advanced() will send notifications into stderr if instructed to, using notificationLevel>0. */ -size_t ZDICT_trainFromBuffer_advanced(void* dictBuffer, size_t dictBufferCapacity, +ZDICTLIB_API size_t ZDICT_trainFromBuffer_advanced(void* dictBuffer, size_t dictBufferCapacity, + const void* samplesBuffer, const size_t* samplesSizes, unsigned nbSamples, + ZDICT_params_t parameters); + +/*! COVER_params_t : + For all values 0 means default. + k and d are the only required parameters.
+*/ +typedef struct { + unsigned k; /* Segment size : constraint: 0 < k : Reasonable range [16, 2048+] */ + unsigned d; /* dmer size : constraint: 0 < d <= k : Reasonable range [6, 16] */ + unsigned steps; /* Number of steps : Only used for optimization : 0 means default (32) : Higher means more parameters checked */ + + unsigned nbThreads; /* Number of threads : constraint: 0 < nbThreads : 1 means single-threaded : Only used for optimization : Ignored if ZSTD_MULTITHREAD is not defined */ + unsigned notificationLevel; /* Write to stderr; 0 = none (default); 1 = errors; 2 = progression; 3 = details; 4 = debug; */ + unsigned dictID; /* 0 means auto mode (32-bits random value); other : force dictID value */ + int compressionLevel; /* 0 means default; target a specific zstd compression level */ +} COVER_params_t; + + +/*! COVER_trainFromBuffer() : + Train a dictionary from an array of samples using the COVER algorithm. + Samples must be stored concatenated in a single flat buffer `samplesBuffer`, + supplied with an array of sizes `samplesSizes`, providing the size of each sample, in order. + The resulting dictionary will be saved into `dictBuffer`. + @return : size of dictionary stored into `dictBuffer` (<= `dictBufferCapacity`) + or an error code, which can be tested with ZDICT_isError(). + Note : COVER_trainFromBuffer() requires about 9 bytes of memory for each input byte. + Tips : In general, a reasonable dictionary has a size of ~ 100 KB. + It's obviously possible to target smaller or larger ones, just by specifying different `dictBufferCapacity`. + In general, it's recommended to provide a few thousands samples, but this can vary a lot. + It's recommended that total size of all samples be about ~x100 times the target size of dictionary. +*/ +ZDICTLIB_API size_t COVER_trainFromBuffer(void* dictBuffer, size_t dictBufferCapacity, + const void* samplesBuffer, const size_t* samplesSizes, unsigned nbSamples, + COVER_params_t parameters); + +/*! 
COVER_optimizeTrainFromBuffer() : + The same requirements as above hold for all the parameters except `parameters`. + This function tries many parameter combinations and picks the best parameters. + `*parameters` is filled with the best parameters found, and the dictionary + constructed with those parameters is stored in `dictBuffer`. + + All of the parameters d, k, steps are optional. + If d is non-zero then we don't check multiple values of d, otherwise we check d = {6, 8, 10, 12, 14, 16}. + If steps is zero it defaults to 32. + If k is non-zero then we don't check multiple values of k, otherwise we check steps values in [16, 2048]. + + @return : size of dictionary stored into `dictBuffer` (<= `dictBufferCapacity`) + or an error code, which can be tested with ZDICT_isError(). + On success `*parameters` contains the parameters selected. + Note : COVER_optimizeTrainFromBuffer() requires about 8 bytes of memory for each input byte and additionally another 5 bytes of memory for each input byte for each thread. +*/ +ZDICTLIB_API size_t COVER_optimizeTrainFromBuffer(void* dictBuffer, size_t dictBufferCapacity, + const void* samplesBuffer, const size_t *samplesSizes, unsigned nbSamples, + COVER_params_t *parameters); + +/*! ZDICT_finalizeDictionary() : + + Given a custom content as a basis for dictionary, and a set of samples, + finalize dictionary by adding headers and statistics. + + Samples must be stored concatenated in a flat buffer `samplesBuffer`, + supplied with an array of sizes `samplesSizes`, providing the size of each sample in order. + + dictContentSize must be >= ZDICT_CONTENTSIZE_MIN bytes. + dictBufferCapacity must be >= dictContentSize, and must be >= ZDICT_DICTSIZE_MIN bytes. + + @return : size of dictionary stored into `dictBuffer` (<= `dictBufferCapacity`), + or an error code, which can be tested by ZDICT_isError(). + note : ZDICT_finalizeDictionary() will push notifications into stderr if instructed to, using notificationLevel>0.
+ note 2 : dictBuffer and customDictContent can overlap +*/ +#define ZDICT_CONTENTSIZE_MIN 256 +#define ZDICT_DICTSIZE_MIN 512 +ZDICTLIB_API size_t ZDICT_finalizeDictionary(void* dictBuffer, size_t dictBufferCapacity, + const void* customDictContent, size_t dictContentSize, const void* samplesBuffer, const size_t* samplesSizes, unsigned nbSamples, ZDICT_params_t parameters); -/*! ZDICT_addEntropyTablesFromBuffer() : - - Given a content-only dictionary (built using any 3rd party algorithm), - add entropy tables computed from an array of samples. - Samples must be stored concatenated in a flat buffer `samplesBuffer`, - supplied with an array of sizes `samplesSizes`, providing the size of each sample in order. - The input dictionary content must be stored *at the end* of `dictBuffer`. - Its size is `dictContentSize`. - The resulting dictionary with added entropy tables will be *written back to `dictBuffer`*, - starting from its beginning. - @return : size of dictionary stored into `dictBuffer` (<= `dictBufferCapacity`). -*/ +/* Deprecation warnings */ +/* It is generally possible to disable deprecation warnings from compiler, + for example with -Wno-deprecated-declarations for gcc + or _CRT_SECURE_NO_WARNINGS in Visual. 
+ Otherwise, it's also possible to manually define ZDICT_DISABLE_DEPRECATE_WARNINGS */ +#ifdef ZDICT_DISABLE_DEPRECATE_WARNINGS +# define ZDICT_DEPRECATED(message) ZDICTLIB_API /* disable deprecation warnings */ +#else +# define ZDICT_GCC_VERSION (__GNUC__ * 100 + __GNUC_MINOR__) +# if defined (__cplusplus) && (__cplusplus >= 201402) /* C++14 or greater */ +# define ZDICT_DEPRECATED(message) ZDICTLIB_API [[deprecated(message)]] +# elif (ZDICT_GCC_VERSION >= 405) || defined(__clang__) +# define ZDICT_DEPRECATED(message) ZDICTLIB_API __attribute__((deprecated(message))) +# elif (ZDICT_GCC_VERSION >= 301) +# define ZDICT_DEPRECATED(message) ZDICTLIB_API __attribute__((deprecated)) +# elif defined(_MSC_VER) +# define ZDICT_DEPRECATED(message) ZDICTLIB_API __declspec(deprecated(message)) +# else +# pragma message("WARNING: You need to implement ZDICT_DEPRECATED for this compiler") +# define ZDICT_DEPRECATED(message) ZDICTLIB_API +# endif +#endif /* ZDICT_DISABLE_DEPRECATE_WARNINGS */ + +ZDICT_DEPRECATED("use ZDICT_finalizeDictionary() instead") size_t ZDICT_addEntropyTablesFromBuffer(void* dictBuffer, size_t dictContentSize, size_t dictBufferCapacity, - const void* samplesBuffer, const size_t* samplesSizes, unsigned nbSamples); - + const void* samplesBuffer, const size_t* samplesSizes, unsigned nbSamples); #endif /* ZDICT_STATIC_LINKING_ONLY */ diff -r ed5b25874d99 -r 4baf79a77afa contrib/python-zstandard/zstd/zstd.h --- a/contrib/python-zstandard/zstd/zstd.h Thu Mar 23 19:54:59 2017 -0700 +++ b/contrib/python-zstandard/zstd/zstd.h Fri Mar 24 08:37:26 2017 -0700 @@ -20,13 +20,16 @@ /* ===== ZSTDLIB_API : control library symbols visibility ===== */ #if defined(__GNUC__) && (__GNUC__ >= 4) -# define ZSTDLIB_API __attribute__ ((visibility ("default"))) -#elif defined(ZSTD_DLL_EXPORT) && (ZSTD_DLL_EXPORT==1) -# define ZSTDLIB_API __declspec(dllexport) +# define ZSTDLIB_VISIBILITY __attribute__ ((visibility ("default"))) +#else +# define ZSTDLIB_VISIBILITY +#endif +#if 
defined(ZSTD_DLL_EXPORT) && (ZSTD_DLL_EXPORT==1) +# define ZSTDLIB_API __declspec(dllexport) ZSTDLIB_VISIBILITY #elif defined(ZSTD_DLL_IMPORT) && (ZSTD_DLL_IMPORT==1) -# define ZSTDLIB_API __declspec(dllimport) /* It isn't required but allows to generate better code, saving a function pointer load from the IAT and an indirect jump.*/ +# define ZSTDLIB_API __declspec(dllimport) ZSTDLIB_VISIBILITY /* It isn't required but allows to generate better code, saving a function pointer load from the IAT and an indirect jump.*/ #else -# define ZSTDLIB_API +# define ZSTDLIB_API ZSTDLIB_VISIBILITY #endif @@ -53,7 +56,7 @@ /*------ Version ------*/ #define ZSTD_VERSION_MAJOR 1 #define ZSTD_VERSION_MINOR 1 -#define ZSTD_VERSION_RELEASE 2 +#define ZSTD_VERSION_RELEASE 3 #define ZSTD_LIB_VERSION ZSTD_VERSION_MAJOR.ZSTD_VERSION_MINOR.ZSTD_VERSION_RELEASE #define ZSTD_QUOTE(str) #str @@ -170,8 +173,8 @@ * When compressing multiple messages / blocks with the same dictionary, it's recommended to load it just once. * ZSTD_createCDict() will create a digested dictionary, ready to start future compression operations without startup delay. * ZSTD_CDict can be created once and used by multiple threads concurrently, as its usage is read-only. -* `dict` can be released after ZSTD_CDict creation. */ -ZSTDLIB_API ZSTD_CDict* ZSTD_createCDict(const void* dict, size_t dictSize, int compressionLevel); +* `dictBuffer` can be released after ZSTD_CDict creation, as its content is copied within CDict */ +ZSTDLIB_API ZSTD_CDict* ZSTD_createCDict(const void* dictBuffer, size_t dictSize, int compressionLevel); /*! ZSTD_freeCDict() : * Function frees memory allocated by ZSTD_createCDict(). */ @@ -191,8 +194,8 @@ /*! ZSTD_createDDict() : * Create a digested dictionary, ready to start decompression operation without startup delay. -* `dict` can be released after creation. 
*/ -ZSTDLIB_API ZSTD_DDict* ZSTD_createDDict(const void* dict, size_t dictSize); +* dictBuffer can be released after DDict creation, as its content is copied inside DDict */ +ZSTDLIB_API ZSTD_DDict* ZSTD_createDDict(const void* dictBuffer, size_t dictSize); /*! ZSTD_freeDDict() : * Function frees memory allocated with ZSTD_createDDict() */ @@ -325,7 +328,7 @@ * ***************************************************************************************/ /* --- Constants ---*/ -#define ZSTD_MAGICNUMBER 0xFD2FB528 /* v0.8 */ +#define ZSTD_MAGICNUMBER 0xFD2FB528 /* >= v0.8.0 */ #define ZSTD_MAGIC_SKIPPABLE_START 0x184D2A50U #define ZSTD_WINDOWLOG_MAX_32 25 @@ -345,8 +348,9 @@ #define ZSTD_TARGETLENGTH_MAX 999 #define ZSTD_FRAMEHEADERSIZE_MAX 18 /* for static allocation */ +#define ZSTD_FRAMEHEADERSIZE_MIN 6 static const size_t ZSTD_frameHeaderSize_prefix = 5; -static const size_t ZSTD_frameHeaderSize_min = 6; +static const size_t ZSTD_frameHeaderSize_min = ZSTD_FRAMEHEADERSIZE_MIN; static const size_t ZSTD_frameHeaderSize_max = ZSTD_FRAMEHEADERSIZE_MAX; static const size_t ZSTD_skippableHeaderSize = 8; /* magic number + skippable frame length */ @@ -365,9 +369,9 @@ } ZSTD_compressionParameters; typedef struct { - unsigned contentSizeFlag; /**< 1: content size will be in frame header (if known). 
*/ - unsigned checksumFlag; /**< 1: will generate a 22-bits checksum at end of frame, to be used for error detection by decompressor */ - unsigned noDictIDFlag; /**< 1: no dict ID will be saved into frame header (if dictionary compression) */ + unsigned contentSizeFlag; /**< 1: content size will be in frame header (when known) */ + unsigned checksumFlag; /**< 1: generate a 32-bits checksum at end of frame, for error detection */ + unsigned noDictIDFlag; /**< 1: no dictID will be saved into frame header (if dictionary compression) */ } ZSTD_frameParameters; typedef struct { @@ -397,9 +401,23 @@ * Gives the amount of memory used by a given ZSTD_CCtx */ ZSTDLIB_API size_t ZSTD_sizeof_CCtx(const ZSTD_CCtx* cctx); +typedef enum { + ZSTD_p_forceWindow /* Force back-references to remain < windowSize, even when referencing Dictionary content (default:0)*/ +} ZSTD_CCtxParameter; +/*! ZSTD_setCCtxParameter() : + * Set advanced parameters, selected through enum ZSTD_CCtxParameter + * @result : 0, or an error code (which can be tested with ZSTD_isError()) */ +ZSTDLIB_API size_t ZSTD_setCCtxParameter(ZSTD_CCtx* cctx, ZSTD_CCtxParameter param, unsigned value); + +/*! ZSTD_createCDict_byReference() : + * Create a digested dictionary for compression + * Dictionary content is simply referenced, and therefore stays in dictBuffer. + * It is important that dictBuffer outlives CDict, it must remain read accessible throughout the lifetime of CDict */ +ZSTDLIB_API ZSTD_CDict* ZSTD_createCDict_byReference(const void* dictBuffer, size_t dictSize, int compressionLevel); + /*! ZSTD_createCDict_advanced() : * Create a ZSTD_CDict using external alloc and free, and customized compression parameters */ -ZSTDLIB_API ZSTD_CDict* ZSTD_createCDict_advanced(const void* dict, size_t dictSize, +ZSTDLIB_API ZSTD_CDict* ZSTD_createCDict_advanced(const void* dict, size_t dictSize, unsigned byReference, ZSTD_parameters params, ZSTD_customMem customMem); /*! 
ZSTD_sizeof_CDict() : @@ -455,6 +473,15 @@ * Gives the amount of memory used by a given ZSTD_DCtx */ ZSTDLIB_API size_t ZSTD_sizeof_DCtx(const ZSTD_DCtx* dctx); +/*! ZSTD_createDDict_byReference() : + * Create a digested dictionary, ready to start decompression operation without startup delay. + * Dictionary content is simply referenced, and therefore stays in dictBuffer. + * It is important that dictBuffer outlives DDict, it must remain read accessible throughout the lifetime of DDict */ +ZSTDLIB_API ZSTD_DDict* ZSTD_createDDict_byReference(const void* dictBuffer, size_t dictSize); + +ZSTDLIB_API ZSTD_DDict* ZSTD_createDDict_advanced(const void* dict, size_t dictSize, + unsigned byReference, ZSTD_customMem customMem); + /*! ZSTD_sizeof_DDict() : * Gives the amount of memory used by a given ZSTD_DDict */ ZSTDLIB_API size_t ZSTD_sizeof_DDict(const ZSTD_DDict* ddict); @@ -463,13 +490,13 @@ * Provides the dictID stored within dictionary. * if @return == 0, the dictionary is not conformant with Zstandard specification. * It can still be loaded, but as a content-only dictionary. */ -unsigned ZSTD_getDictID_fromDict(const void* dict, size_t dictSize); +ZSTDLIB_API unsigned ZSTD_getDictID_fromDict(const void* dict, size_t dictSize); /*! ZSTD_getDictID_fromDDict() : * Provides the dictID of the dictionary loaded into `ddict`. * If @return == 0, the dictionary is not conformant to Zstandard specification, or empty. * Non-conformant dictionaries can still be loaded, but as content-only dictionaries. */ -unsigned ZSTD_getDictID_fromDDict(const ZSTD_DDict* ddict); +ZSTDLIB_API unsigned ZSTD_getDictID_fromDDict(const ZSTD_DDict* ddict); /*! ZSTD_getDictID_fromFrame() : * Provides the dictID required to decompressed the frame stored within `src`. @@ -481,7 +508,7 @@ * - `srcSize` is too small, and as a result, the frame header could not be decoded (only possible if `srcSize < ZSTD_FRAMEHEADERSIZE_MAX`). * - This is not a Zstandard frame. 
* When identifying the exact failure cause, it's possible to used ZSTD_getFrameParams(), which will provide a more precise error code. */ -unsigned ZSTD_getDictID_fromFrame(const void* src, size_t srcSize); +ZSTDLIB_API unsigned ZSTD_getDictID_fromFrame(const void* src, size_t srcSize); /******************************************************************** @@ -491,7 +518,7 @@ /*===== Advanced Streaming compression functions =====*/ ZSTDLIB_API ZSTD_CStream* ZSTD_createCStream_advanced(ZSTD_customMem customMem); ZSTDLIB_API size_t ZSTD_initCStream_srcSize(ZSTD_CStream* zcs, int compressionLevel, unsigned long long pledgedSrcSize); /**< pledgedSrcSize must be correct */ -ZSTDLIB_API size_t ZSTD_initCStream_usingDict(ZSTD_CStream* zcs, const void* dict, size_t dictSize, int compressionLevel); +ZSTDLIB_API size_t ZSTD_initCStream_usingDict(ZSTD_CStream* zcs, const void* dict, size_t dictSize, int compressionLevel); /**< note: a dict will not be used if dict == NULL or dictSize < 8 */ ZSTDLIB_API size_t ZSTD_initCStream_advanced(ZSTD_CStream* zcs, const void* dict, size_t dictSize, ZSTD_parameters params, unsigned long long pledgedSrcSize); /**< pledgedSrcSize is optional and can be zero == unknown */ ZSTDLIB_API size_t ZSTD_initCStream_usingCDict(ZSTD_CStream* zcs, const ZSTD_CDict* cdict); /**< note : cdict will just be referenced, and must outlive compression session */ @@ -500,9 +527,9 @@ /*===== Advanced Streaming decompression functions =====*/ -typedef enum { ZSTDdsp_maxWindowSize } ZSTD_DStreamParameter_e; +typedef enum { DStream_p_maxWindowSize } ZSTD_DStreamParameter_e; ZSTDLIB_API ZSTD_DStream* ZSTD_createDStream_advanced(ZSTD_customMem customMem); -ZSTDLIB_API size_t ZSTD_initDStream_usingDict(ZSTD_DStream* zds, const void* dict, size_t dictSize); +ZSTDLIB_API size_t ZSTD_initDStream_usingDict(ZSTD_DStream* zds, const void* dict, size_t dictSize); /**< note: a dict will not be used if dict == NULL or dictSize < 8 */ ZSTDLIB_API size_t 
ZSTD_setDStreamParameter(ZSTD_DStream* zds, ZSTD_DStreamParameter_e paramType, unsigned paramValue); ZSTDLIB_API size_t ZSTD_initDStream_usingDDict(ZSTD_DStream* zds, const ZSTD_DDict* ddict); /**< note : ddict will just be referenced, and must outlive decompression session */ ZSTDLIB_API size_t ZSTD_resetDStream(ZSTD_DStream* zds); /**< re-use decompression parameters from previous init; saves dictionary loading */ @@ -542,10 +569,10 @@ In which case, it will "discard" the relevant memory section from its history. Finish a frame with ZSTD_compressEnd(), which will write the last block(s) and optional checksum. - It's possible to use a NULL,0 src content, in which case, it will write a final empty block to end the frame, - Without last block mark, frames will be considered unfinished (broken) by decoders. + It's possible to use srcSize==0, in which case, it will write a final empty block to end the frame. + Without last block mark, frames will be considered unfinished (corrupted) by decoders. - You can then reuse `ZSTD_CCtx` (ZSTD_compressBegin()) to compress some new frame. + `ZSTD_CCtx` object can be re-used (ZSTD_compressBegin()) to compress some new frame. 
*/ /*===== Buffer-less streaming compression functions =====*/ @@ -553,6 +580,7 @@ ZSTDLIB_API size_t ZSTD_compressBegin_usingDict(ZSTD_CCtx* cctx, const void* dict, size_t dictSize, int compressionLevel); ZSTDLIB_API size_t ZSTD_compressBegin_advanced(ZSTD_CCtx* cctx, const void* dict, size_t dictSize, ZSTD_parameters params, unsigned long long pledgedSrcSize); ZSTDLIB_API size_t ZSTD_copyCCtx(ZSTD_CCtx* cctx, const ZSTD_CCtx* preparedCCtx, unsigned long long pledgedSrcSize); +ZSTDLIB_API size_t ZSTD_compressBegin_usingCDict(ZSTD_CCtx* cctx, const ZSTD_CDict* cdict, unsigned long long pledgedSrcSize); ZSTDLIB_API size_t ZSTD_compressContinue(ZSTD_CCtx* cctx, void* dst, size_t dstCapacity, const void* src, size_t srcSize); ZSTDLIB_API size_t ZSTD_compressEnd(ZSTD_CCtx* cctx, void* dst, size_t dstCapacity, const void* src, size_t srcSize); diff -r ed5b25874d99 -r 4baf79a77afa contrib/python-zstandard/zstd_cffi.py --- a/contrib/python-zstandard/zstd_cffi.py Thu Mar 23 19:54:59 2017 -0700 +++ b/contrib/python-zstandard/zstd_cffi.py Fri Mar 24 08:37:26 2017 -0700 @@ -8,145 +8,1035 @@ from __future__ import absolute_import, unicode_literals -import io +import sys from _zstd_cffi import ( ffi, lib, ) +if sys.version_info[0] == 2: + bytes_type = str + int_type = long +else: + bytes_type = bytes + int_type = int -_CSTREAM_IN_SIZE = lib.ZSTD_CStreamInSize() -_CSTREAM_OUT_SIZE = lib.ZSTD_CStreamOutSize() + +COMPRESSION_RECOMMENDED_INPUT_SIZE = lib.ZSTD_CStreamInSize() +COMPRESSION_RECOMMENDED_OUTPUT_SIZE = lib.ZSTD_CStreamOutSize() +DECOMPRESSION_RECOMMENDED_INPUT_SIZE = lib.ZSTD_DStreamInSize() +DECOMPRESSION_RECOMMENDED_OUTPUT_SIZE = lib.ZSTD_DStreamOutSize() + +new_nonzero = ffi.new_allocator(should_clear_after_alloc=False) + + +MAX_COMPRESSION_LEVEL = lib.ZSTD_maxCLevel() +MAGIC_NUMBER = lib.ZSTD_MAGICNUMBER +FRAME_HEADER = b'\x28\xb5\x2f\xfd' +ZSTD_VERSION = (lib.ZSTD_VERSION_MAJOR, lib.ZSTD_VERSION_MINOR, lib.ZSTD_VERSION_RELEASE) + +WINDOWLOG_MIN = 
lib.ZSTD_WINDOWLOG_MIN +WINDOWLOG_MAX = lib.ZSTD_WINDOWLOG_MAX +CHAINLOG_MIN = lib.ZSTD_CHAINLOG_MIN +CHAINLOG_MAX = lib.ZSTD_CHAINLOG_MAX +HASHLOG_MIN = lib.ZSTD_HASHLOG_MIN +HASHLOG_MAX = lib.ZSTD_HASHLOG_MAX +HASHLOG3_MAX = lib.ZSTD_HASHLOG3_MAX +SEARCHLOG_MIN = lib.ZSTD_SEARCHLOG_MIN +SEARCHLOG_MAX = lib.ZSTD_SEARCHLOG_MAX +SEARCHLENGTH_MIN = lib.ZSTD_SEARCHLENGTH_MIN +SEARCHLENGTH_MAX = lib.ZSTD_SEARCHLENGTH_MAX +TARGETLENGTH_MIN = lib.ZSTD_TARGETLENGTH_MIN +TARGETLENGTH_MAX = lib.ZSTD_TARGETLENGTH_MAX + +STRATEGY_FAST = lib.ZSTD_fast +STRATEGY_DFAST = lib.ZSTD_dfast +STRATEGY_GREEDY = lib.ZSTD_greedy +STRATEGY_LAZY = lib.ZSTD_lazy +STRATEGY_LAZY2 = lib.ZSTD_lazy2 +STRATEGY_BTLAZY2 = lib.ZSTD_btlazy2 +STRATEGY_BTOPT = lib.ZSTD_btopt + +COMPRESSOBJ_FLUSH_FINISH = 0 +COMPRESSOBJ_FLUSH_BLOCK = 1 + + +class ZstdError(Exception): + pass -class _ZstdCompressionWriter(object): - def __init__(self, cstream, writer): - self._cstream = cstream +class CompressionParameters(object): + def __init__(self, window_log, chain_log, hash_log, search_log, + search_length, target_length, strategy): + if window_log < WINDOWLOG_MIN or window_log > WINDOWLOG_MAX: + raise ValueError('invalid window log value') + + if chain_log < CHAINLOG_MIN or chain_log > CHAINLOG_MAX: + raise ValueError('invalid chain log value') + + if hash_log < HASHLOG_MIN or hash_log > HASHLOG_MAX: + raise ValueError('invalid hash log value') + + if search_log < SEARCHLOG_MIN or search_log > SEARCHLOG_MAX: + raise ValueError('invalid search log value') + + if search_length < SEARCHLENGTH_MIN or search_length > SEARCHLENGTH_MAX: + raise ValueError('invalid search length value') + + if target_length < TARGETLENGTH_MIN or target_length > TARGETLENGTH_MAX: + raise ValueError('invalid target length value') + + if strategy < STRATEGY_FAST or strategy > STRATEGY_BTOPT: + raise ValueError('invalid strategy value') + + self.window_log = window_log + self.chain_log = chain_log + self.hash_log = hash_log + self.search_log 
= search_log + self.search_length = search_length + self.target_length = target_length + self.strategy = strategy + + def as_compression_parameters(self): + p = ffi.new('ZSTD_compressionParameters *')[0] + p.windowLog = self.window_log + p.chainLog = self.chain_log + p.hashLog = self.hash_log + p.searchLog = self.search_log + p.searchLength = self.search_length + p.targetLength = self.target_length + p.strategy = self.strategy + + return p + +def get_compression_parameters(level, source_size=0, dict_size=0): + params = lib.ZSTD_getCParams(level, source_size, dict_size) + return CompressionParameters(window_log=params.windowLog, + chain_log=params.chainLog, + hash_log=params.hashLog, + search_log=params.searchLog, + search_length=params.searchLength, + target_length=params.targetLength, + strategy=params.strategy) + + +def estimate_compression_context_size(params): + if not isinstance(params, CompressionParameters): + raise ValueError('argument must be a CompressionParameters') + + cparams = params.as_compression_parameters() + return lib.ZSTD_estimateCCtxSize(cparams) + + +def estimate_decompression_context_size(): + return lib.ZSTD_estimateDCtxSize() + + +class ZstdCompressionWriter(object): + def __init__(self, compressor, writer, source_size, write_size): + self._compressor = compressor self._writer = writer + self._source_size = source_size + self._write_size = write_size + self._entered = False def __enter__(self): + if self._entered: + raise ZstdError('cannot __enter__ multiple times') + + self._cstream = self._compressor._get_cstream(self._source_size) + self._entered = True return self def __exit__(self, exc_type, exc_value, exc_tb): + self._entered = False + if not exc_type and not exc_value and not exc_tb: out_buffer = ffi.new('ZSTD_outBuffer *') - out_buffer.dst = ffi.new('char[]', _CSTREAM_OUT_SIZE) - out_buffer.size = _CSTREAM_OUT_SIZE + dst_buffer = ffi.new('char[]', self._write_size) + out_buffer.dst = dst_buffer + out_buffer.size = self._write_size 
out_buffer.pos = 0 while True: - res = lib.ZSTD_endStream(self._cstream, out_buffer) - if lib.ZSTD_isError(res): - raise Exception('error ending compression stream: %s' % lib.ZSTD_getErrorName) + zresult = lib.ZSTD_endStream(self._cstream, out_buffer) + if lib.ZSTD_isError(zresult): + raise ZstdError('error ending compression stream: %s' % + ffi.string(lib.ZSTD_getErrorName(zresult))) if out_buffer.pos: - self._writer.write(ffi.buffer(out_buffer.dst, out_buffer.pos)) + self._writer.write(ffi.buffer(out_buffer.dst, out_buffer.pos)[:]) out_buffer.pos = 0 - if res == 0: + if zresult == 0: break + self._cstream = None + self._compressor = None + return False + def memory_size(self): + if not self._entered: + raise ZstdError('cannot determine size of an inactive compressor; ' + 'call when a context manager is active') + + return lib.ZSTD_sizeof_CStream(self._cstream) + def write(self, data): + if not self._entered: + raise ZstdError('write() must be called from an active context ' + 'manager') + + total_write = 0 + + data_buffer = ffi.from_buffer(data) + + in_buffer = ffi.new('ZSTD_inBuffer *') + in_buffer.src = data_buffer + in_buffer.size = len(data_buffer) + in_buffer.pos = 0 + out_buffer = ffi.new('ZSTD_outBuffer *') - out_buffer.dst = ffi.new('char[]', _CSTREAM_OUT_SIZE) - out_buffer.size = _CSTREAM_OUT_SIZE + dst_buffer = ffi.new('char[]', self._write_size) + out_buffer.dst = dst_buffer + out_buffer.size = self._write_size + out_buffer.pos = 0 + + while in_buffer.pos < in_buffer.size: + zresult = lib.ZSTD_compressStream(self._cstream, out_buffer, in_buffer) + if lib.ZSTD_isError(zresult): + raise ZstdError('zstd compress error: %s' % + ffi.string(lib.ZSTD_getErrorName(zresult))) + + if out_buffer.pos: + self._writer.write(ffi.buffer(out_buffer.dst, out_buffer.pos)[:]) + total_write += out_buffer.pos + out_buffer.pos = 0 + + return total_write + + def flush(self): + if not self._entered: + raise ZstdError('flush must be called from an active context manager') + + 
total_write = 0 + + out_buffer = ffi.new('ZSTD_outBuffer *') + dst_buffer = ffi.new('char[]', self._write_size) + out_buffer.dst = dst_buffer + out_buffer.size = self._write_size out_buffer.pos = 0 - # TODO can we reuse existing memory? - in_buffer = ffi.new('ZSTD_inBuffer *') - in_buffer.src = ffi.new('char[]', data) - in_buffer.size = len(data) - in_buffer.pos = 0 - while in_buffer.pos < in_buffer.size: - res = lib.ZSTD_compressStream(self._cstream, out_buffer, in_buffer) - if lib.ZSTD_isError(res): - raise Exception('zstd compress error: %s' % lib.ZSTD_getErrorName(res)) + while True: + zresult = lib.ZSTD_flushStream(self._cstream, out_buffer) + if lib.ZSTD_isError(zresult): + raise ZstdError('zstd compress error: %s' % + ffi.string(lib.ZSTD_getErrorName(zresult))) + + if not out_buffer.pos: + break + + self._writer.write(ffi.buffer(out_buffer.dst, out_buffer.pos)[:]) + total_write += out_buffer.pos + out_buffer.pos = 0 + + return total_write + + +class ZstdCompressionObj(object): + def compress(self, data): + if self._finished: + raise ZstdError('cannot call compress() after compressor finished') + + data_buffer = ffi.from_buffer(data) + source = ffi.new('ZSTD_inBuffer *') + source.src = data_buffer + source.size = len(data_buffer) + source.pos = 0 + + chunks = [] + + while source.pos < len(data): + zresult = lib.ZSTD_compressStream(self._cstream, self._out, source) + if lib.ZSTD_isError(zresult): + raise ZstdError('zstd compress error: %s' % + ffi.string(lib.ZSTD_getErrorName(zresult))) + + if self._out.pos: + chunks.append(ffi.buffer(self._out.dst, self._out.pos)[:]) + self._out.pos = 0 + + return b''.join(chunks) - if out_buffer.pos: - self._writer.write(ffi.buffer(out_buffer.dst, out_buffer.pos)) - out_buffer.pos = 0 + def flush(self, flush_mode=COMPRESSOBJ_FLUSH_FINISH): + if flush_mode not in (COMPRESSOBJ_FLUSH_FINISH, COMPRESSOBJ_FLUSH_BLOCK): + raise ValueError('flush mode not recognized') + + if self._finished: + raise ZstdError('compressor object 
already finished') + + assert self._out.pos == 0 + + if flush_mode == COMPRESSOBJ_FLUSH_BLOCK: + zresult = lib.ZSTD_flushStream(self._cstream, self._out) + if lib.ZSTD_isError(zresult): + raise ZstdError('zstd compress error: %s' % + ffi.string(lib.ZSTD_getErrorName(zresult))) + + # Output buffer is guaranteed to hold full block. + assert zresult == 0 + + if self._out.pos: + result = ffi.buffer(self._out.dst, self._out.pos)[:] + self._out.pos = 0 + return result + else: + return b'' + + assert flush_mode == COMPRESSOBJ_FLUSH_FINISH + self._finished = True + + chunks = [] + + while True: + zresult = lib.ZSTD_endStream(self._cstream, self._out) + if lib.ZSTD_isError(zresult): + raise ZstdError('error ending compression stream: %s' % + ffi.string(lib.ZSTD_getErrorName(zresult))) + + if self._out.pos: + chunks.append(ffi.buffer(self._out.dst, self._out.pos)[:]) + self._out.pos = 0 + + if not zresult: + break + + # GC compression stream immediately. + self._cstream = None + + return b''.join(chunks) class ZstdCompressor(object): - def __init__(self, level=3, dict_data=None, compression_params=None): - if dict_data: - raise Exception('dict_data not yet supported') - if compression_params: - raise Exception('compression_params not yet supported') + def __init__(self, level=3, dict_data=None, compression_params=None, + write_checksum=False, write_content_size=False, + write_dict_id=True): + if level < 1: + raise ValueError('level must be greater than 0') + elif level > lib.ZSTD_maxCLevel(): + raise ValueError('level must be less than %d' % lib.ZSTD_maxCLevel()) self._compression_level = level + self._dict_data = dict_data + self._cparams = compression_params + self._fparams = ffi.new('ZSTD_frameParameters *')[0] + self._fparams.checksumFlag = write_checksum + self._fparams.contentSizeFlag = write_content_size + self._fparams.noDictIDFlag = not write_dict_id - def compress(self, data): - # Just use the stream API for now.
- output = io.BytesIO() - with self.write_to(output) as compressor: - compressor.write(data) - return output.getvalue() + cctx = lib.ZSTD_createCCtx() + if cctx == ffi.NULL: + raise MemoryError() + + self._cctx = ffi.gc(cctx, lib.ZSTD_freeCCtx) + + def compress(self, data, allow_empty=False): + if len(data) == 0 and self._fparams.contentSizeFlag and not allow_empty: + raise ValueError('cannot write empty inputs when writing content sizes') + + # TODO use a CDict for performance. + dict_data = ffi.NULL + dict_size = 0 + + if self._dict_data: + dict_data = self._dict_data.as_bytes() + dict_size = len(self._dict_data) + + params = ffi.new('ZSTD_parameters *')[0] + if self._cparams: + params.cParams = self._cparams.as_compression_parameters() + else: + params.cParams = lib.ZSTD_getCParams(self._compression_level, len(data), + dict_size) + params.fParams = self._fparams + + dest_size = lib.ZSTD_compressBound(len(data)) + out = new_nonzero('char[]', dest_size) - def copy_stream(self, ifh, ofh): - cstream = self._get_cstream() + zresult = lib.ZSTD_compress_advanced(self._cctx, + ffi.addressof(out), dest_size, + data, len(data), + dict_data, dict_size, + params) + + if lib.ZSTD_isError(zresult): + raise ZstdError('cannot compress: %s' % + ffi.string(lib.ZSTD_getErrorName(zresult))) + + return ffi.buffer(out, zresult)[:] + + def compressobj(self, size=0): + cstream = self._get_cstream(size) + cobj = ZstdCompressionObj() + cobj._cstream = cstream + cobj._out = ffi.new('ZSTD_outBuffer *') + cobj._dst_buffer = ffi.new('char[]', COMPRESSION_RECOMMENDED_OUTPUT_SIZE) + cobj._out.dst = cobj._dst_buffer + cobj._out.size = COMPRESSION_RECOMMENDED_OUTPUT_SIZE + cobj._out.pos = 0 + cobj._compressor = self + cobj._finished = False + + return cobj + + def copy_stream(self, ifh, ofh, size=0, + read_size=COMPRESSION_RECOMMENDED_INPUT_SIZE, + write_size=COMPRESSION_RECOMMENDED_OUTPUT_SIZE): + + if not hasattr(ifh, 'read'): + raise ValueError('first argument must have a read() method') + if 
not hasattr(ofh, 'write'): + raise ValueError('second argument must have a write() method') + + cstream = self._get_cstream(size) in_buffer = ffi.new('ZSTD_inBuffer *') out_buffer = ffi.new('ZSTD_outBuffer *') - out_buffer.dst = ffi.new('char[]', _CSTREAM_OUT_SIZE) - out_buffer.size = _CSTREAM_OUT_SIZE + dst_buffer = ffi.new('char[]', write_size) + out_buffer.dst = dst_buffer + out_buffer.size = write_size out_buffer.pos = 0 total_read, total_write = 0, 0 while True: - data = ifh.read(_CSTREAM_IN_SIZE) + data = ifh.read(read_size) if not data: break - total_read += len(data) - - in_buffer.src = ffi.new('char[]', data) - in_buffer.size = len(data) + data_buffer = ffi.from_buffer(data) + total_read += len(data_buffer) + in_buffer.src = data_buffer + in_buffer.size = len(data_buffer) in_buffer.pos = 0 while in_buffer.pos < in_buffer.size: - res = lib.ZSTD_compressStream(cstream, out_buffer, in_buffer) - if lib.ZSTD_isError(res): - raise Exception('zstd compress error: %s' % - lib.ZSTD_getErrorName(res)) + zresult = lib.ZSTD_compressStream(cstream, out_buffer, in_buffer) + if lib.ZSTD_isError(zresult): + raise ZstdError('zstd compress error: %s' % + ffi.string(lib.ZSTD_getErrorName(zresult))) if out_buffer.pos: ofh.write(ffi.buffer(out_buffer.dst, out_buffer.pos)) - total_write = out_buffer.pos + total_write += out_buffer.pos out_buffer.pos = 0 # We've finished reading. Flush the compressor. 
while True: - res = lib.ZSTD_endStream(cstream, out_buffer) - if lib.ZSTD_isError(res): - raise Exception('error ending compression stream: %s' % - lib.ZSTD_getErrorName(res)) + zresult = lib.ZSTD_endStream(cstream, out_buffer) + if lib.ZSTD_isError(zresult): + raise ZstdError('error ending compression stream: %s' % + ffi.string(lib.ZSTD_getErrorName(zresult))) if out_buffer.pos: ofh.write(ffi.buffer(out_buffer.dst, out_buffer.pos)) total_write += out_buffer.pos out_buffer.pos = 0 - if res == 0: + if zresult == 0: break return total_read, total_write - def write_to(self, writer): - return _ZstdCompressionWriter(self._get_cstream(), writer) + def write_to(self, writer, size=0, + write_size=COMPRESSION_RECOMMENDED_OUTPUT_SIZE): + + if not hasattr(writer, 'write'): + raise ValueError('must pass an object with a write() method') + + return ZstdCompressionWriter(self, writer, size, write_size) + + def read_from(self, reader, size=0, + read_size=COMPRESSION_RECOMMENDED_INPUT_SIZE, + write_size=COMPRESSION_RECOMMENDED_OUTPUT_SIZE): + if hasattr(reader, 'read'): + have_read = True + elif hasattr(reader, '__getitem__'): + have_read = False + buffer_offset = 0 + size = len(reader) + else: + raise ValueError('must pass an object with a read() method or ' + 'conforms to buffer protocol') + + cstream = self._get_cstream(size) + + in_buffer = ffi.new('ZSTD_inBuffer *') + out_buffer = ffi.new('ZSTD_outBuffer *') + + in_buffer.src = ffi.NULL + in_buffer.size = 0 + in_buffer.pos = 0 + + dst_buffer = ffi.new('char[]', write_size) + out_buffer.dst = dst_buffer + out_buffer.size = write_size + out_buffer.pos = 0 + + while True: + # We should never have output data sitting around after a previous + # iteration. + assert out_buffer.pos == 0 + + # Collect input data. 
+ if have_read: + read_result = reader.read(read_size) + else: + remaining = len(reader) - buffer_offset + slice_size = min(remaining, read_size) + read_result = reader[buffer_offset:buffer_offset + slice_size] + buffer_offset += slice_size - def _get_cstream(self): + # No new input data. Break out of the read loop. + if not read_result: + break + + # Feed all read data into the compressor and emit output until + # exhausted. + read_buffer = ffi.from_buffer(read_result) + in_buffer.src = read_buffer + in_buffer.size = len(read_buffer) + in_buffer.pos = 0 + + while in_buffer.pos < in_buffer.size: + zresult = lib.ZSTD_compressStream(cstream, out_buffer, in_buffer) + if lib.ZSTD_isError(zresult): + raise ZstdError('zstd compress error: %s' % + ffi.string(lib.ZSTD_getErrorName(zresult))) + + if out_buffer.pos: + data = ffi.buffer(out_buffer.dst, out_buffer.pos)[:] + out_buffer.pos = 0 + yield data + + assert out_buffer.pos == 0 + + # And repeat the loop to collect more data. + continue + + # If we get here, input is exhausted. End the stream and emit what + # remains. 
+ while True: + assert out_buffer.pos == 0 + zresult = lib.ZSTD_endStream(cstream, out_buffer) + if lib.ZSTD_isError(zresult): + raise ZstdError('error ending compression stream: %s' % + ffi.string(lib.ZSTD_getErrorName(zresult))) + + if out_buffer.pos: + data = ffi.buffer(out_buffer.dst, out_buffer.pos)[:] + out_buffer.pos = 0 + yield data + + if zresult == 0: + break + + def _get_cstream(self, size): cstream = lib.ZSTD_createCStream() + if cstream == ffi.NULL: + raise MemoryError() + cstream = ffi.gc(cstream, lib.ZSTD_freeCStream) - res = lib.ZSTD_initCStream(cstream, self._compression_level) - if lib.ZSTD_isError(res): + dict_data = ffi.NULL + dict_size = 0 + if self._dict_data: + dict_data = self._dict_data.as_bytes() + dict_size = len(self._dict_data) + + zparams = ffi.new('ZSTD_parameters *')[0] + if self._cparams: + zparams.cParams = self._cparams.as_compression_parameters() + else: + zparams.cParams = lib.ZSTD_getCParams(self._compression_level, + size, dict_size) + zparams.fParams = self._fparams + + zresult = lib.ZSTD_initCStream_advanced(cstream, dict_data, dict_size, + zparams, size) + if lib.ZSTD_isError(zresult): raise Exception('cannot init CStream: %s' % - lib.ZSTD_getErrorName(res)) + ffi.string(lib.ZSTD_getErrorName(zresult))) return cstream + + +class FrameParameters(object): + def __init__(self, fparams): + self.content_size = fparams.frameContentSize + self.window_size = fparams.windowSize + self.dict_id = fparams.dictID + self.has_checksum = bool(fparams.checksumFlag) + + +def get_frame_parameters(data): + if not isinstance(data, bytes_type): + raise TypeError('argument must be bytes') + + params = ffi.new('ZSTD_frameParams *') + + zresult = lib.ZSTD_getFrameParams(params, data, len(data)) + if lib.ZSTD_isError(zresult): + raise ZstdError('cannot get frame parameters: %s' % + ffi.string(lib.ZSTD_getErrorName(zresult))) + + if zresult: + raise ZstdError('not enough data for frame parameters; need %d bytes' % + zresult) + + return 
FrameParameters(params[0]) + + +class ZstdCompressionDict(object): + def __init__(self, data): + assert isinstance(data, bytes_type) + self._data = data + + def __len__(self): + return len(self._data) + + def dict_id(self): + return int_type(lib.ZDICT_getDictID(self._data, len(self._data))) + + def as_bytes(self): + return self._data + + +def train_dictionary(dict_size, samples, parameters=None): + if not isinstance(samples, list): + raise TypeError('samples must be a list') + + total_size = sum(map(len, samples)) + + samples_buffer = new_nonzero('char[]', total_size) + sample_sizes = new_nonzero('size_t[]', len(samples)) + + offset = 0 + for i, sample in enumerate(samples): + if not isinstance(sample, bytes_type): + raise ValueError('samples must be bytes') + + l = len(sample) + ffi.memmove(samples_buffer + offset, sample, l) + offset += l + sample_sizes[i] = l + + dict_data = new_nonzero('char[]', dict_size) + + zresult = lib.ZDICT_trainFromBuffer(ffi.addressof(dict_data), dict_size, + ffi.addressof(samples_buffer), + ffi.addressof(sample_sizes, 0), + len(samples)) + if lib.ZDICT_isError(zresult): + raise ZstdError('Cannot train dict: %s' % + ffi.string(lib.ZDICT_getErrorName(zresult))) + + return ZstdCompressionDict(ffi.buffer(dict_data, zresult)[:]) + + +class ZstdDecompressionObj(object): + def __init__(self, decompressor): + self._decompressor = decompressor + self._dstream = self._decompressor._get_dstream() + self._finished = False + + def decompress(self, data): + if self._finished: + raise ZstdError('cannot use a decompressobj multiple times') + + in_buffer = ffi.new('ZSTD_inBuffer *') + out_buffer = ffi.new('ZSTD_outBuffer *') + + data_buffer = ffi.from_buffer(data) + in_buffer.src = data_buffer + in_buffer.size = len(data_buffer) + in_buffer.pos = 0 + + dst_buffer = ffi.new('char[]', DECOMPRESSION_RECOMMENDED_OUTPUT_SIZE) + out_buffer.dst = dst_buffer + out_buffer.size = len(dst_buffer) + out_buffer.pos = 0 + + chunks = [] + + while in_buffer.pos < 
in_buffer.size: + zresult = lib.ZSTD_decompressStream(self._dstream, out_buffer, in_buffer) + if lib.ZSTD_isError(zresult): + raise ZstdError('zstd decompressor error: %s' % + ffi.string(lib.ZSTD_getErrorName(zresult))) + + if zresult == 0: + self._finished = True + self._dstream = None + self._decompressor = None + + if out_buffer.pos: + chunks.append(ffi.buffer(out_buffer.dst, out_buffer.pos)[:]) + out_buffer.pos = 0 + + return b''.join(chunks) + + +class ZstdDecompressionWriter(object): + def __init__(self, decompressor, writer, write_size): + self._decompressor = decompressor + self._writer = writer + self._write_size = write_size + self._dstream = None + self._entered = False + + def __enter__(self): + if self._entered: + raise ZstdError('cannot __enter__ multiple times') + + self._dstream = self._decompressor._get_dstream() + self._entered = True + + return self + + def __exit__(self, exc_type, exc_value, exc_tb): + self._entered = False + self._dstream = None + + def memory_size(self): + if not self._dstream: + raise ZstdError('cannot determine size of inactive decompressor ' + 'call when context manager is active') + + return lib.ZSTD_sizeof_DStream(self._dstream) + + def write(self, data): + if not self._entered: + raise ZstdError('write must be called from an active context manager') + + total_write = 0 + + in_buffer = ffi.new('ZSTD_inBuffer *') + out_buffer = ffi.new('ZSTD_outBuffer *') + + data_buffer = ffi.from_buffer(data) + in_buffer.src = data_buffer + in_buffer.size = len(data_buffer) + in_buffer.pos = 0 + + dst_buffer = ffi.new('char[]', self._write_size) + out_buffer.dst = dst_buffer + out_buffer.size = len(dst_buffer) + out_buffer.pos = 0 + + while in_buffer.pos < in_buffer.size: + zresult = lib.ZSTD_decompressStream(self._dstream, out_buffer, in_buffer) + if lib.ZSTD_isError(zresult): + raise ZstdError('zstd decompress error: %s' % + ffi.string(lib.ZSTD_getErrorName(zresult))) + + if out_buffer.pos: + 
self._writer.write(ffi.buffer(out_buffer.dst, out_buffer.pos)[:]) + total_write += out_buffer.pos + out_buffer.pos = 0 + + return total_write + + +class ZstdDecompressor(object): + def __init__(self, dict_data=None): + self._dict_data = dict_data + + dctx = lib.ZSTD_createDCtx() + if dctx == ffi.NULL: + raise MemoryError() + + self._refdctx = ffi.gc(dctx, lib.ZSTD_freeDCtx) + + @property + def _ddict(self): + if self._dict_data: + dict_data = self._dict_data.as_bytes() + dict_size = len(self._dict_data) + + ddict = lib.ZSTD_createDDict(dict_data, dict_size) + if ddict == ffi.NULL: + raise ZstdError('could not create decompression dict') + else: + ddict = None + + self.__dict__['_ddict'] = ddict + return ddict + + def decompress(self, data, max_output_size=0): + data_buffer = ffi.from_buffer(data) + + orig_dctx = new_nonzero('char[]', lib.ZSTD_sizeof_DCtx(self._refdctx)) + dctx = ffi.cast('ZSTD_DCtx *', orig_dctx) + lib.ZSTD_copyDCtx(dctx, self._refdctx) + + ddict = self._ddict + + output_size = lib.ZSTD_getDecompressedSize(data_buffer, len(data_buffer)) + if output_size: + result_buffer = ffi.new('char[]', output_size) + result_size = output_size + else: + if not max_output_size: + raise ZstdError('input data invalid or missing content size ' + 'in frame header') + + result_buffer = ffi.new('char[]', max_output_size) + result_size = max_output_size + + if ddict: + zresult = lib.ZSTD_decompress_usingDDict(dctx, + result_buffer, result_size, + data_buffer, len(data_buffer), + ddict) + else: + zresult = lib.ZSTD_decompressDCtx(dctx, + result_buffer, result_size, + data_buffer, len(data_buffer)) + if lib.ZSTD_isError(zresult): + raise ZstdError('decompression error: %s' % + ffi.string(lib.ZSTD_getErrorName(zresult))) + elif output_size and zresult != output_size: + raise ZstdError('decompression error: decompressed %d bytes; expected %d' % + (zresult, output_size)) + + return ffi.buffer(result_buffer, zresult)[:] + + def decompressobj(self): + return 
ZstdDecompressionObj(self) + + def read_from(self, reader, read_size=DECOMPRESSION_RECOMMENDED_INPUT_SIZE, + write_size=DECOMPRESSION_RECOMMENDED_OUTPUT_SIZE, + skip_bytes=0): + if skip_bytes >= read_size: + raise ValueError('skip_bytes must be smaller than read_size') + + if hasattr(reader, 'read'): + have_read = True + elif hasattr(reader, '__getitem__'): + have_read = False + buffer_offset = 0 + size = len(reader) + else: + raise ValueError('must pass an object with a read() method or ' + 'conforms to buffer protocol') + + if skip_bytes: + if have_read: + reader.read(skip_bytes) + else: + if skip_bytes > size: + raise ValueError('skip_bytes larger than first input chunk') + + buffer_offset = skip_bytes + + dstream = self._get_dstream() + + in_buffer = ffi.new('ZSTD_inBuffer *') + out_buffer = ffi.new('ZSTD_outBuffer *') + + dst_buffer = ffi.new('char[]', write_size) + out_buffer.dst = dst_buffer + out_buffer.size = len(dst_buffer) + out_buffer.pos = 0 + + while True: + assert out_buffer.pos == 0 + + if have_read: + read_result = reader.read(read_size) + else: + remaining = size - buffer_offset + slice_size = min(remaining, read_size) + read_result = reader[buffer_offset:buffer_offset + slice_size] + buffer_offset += slice_size + + # No new input. Break out of read loop. + if not read_result: + break + + # Feed all read data into decompressor and emit output until + # exhausted. + read_buffer = ffi.from_buffer(read_result) + in_buffer.src = read_buffer + in_buffer.size = len(read_buffer) + in_buffer.pos = 0 + + while in_buffer.pos < in_buffer.size: + assert out_buffer.pos == 0 + + zresult = lib.ZSTD_decompressStream(dstream, out_buffer, in_buffer) + if lib.ZSTD_isError(zresult): + raise ZstdError('zstd decompress error: %s' % + ffi.string(lib.ZSTD_getErrorName(zresult))) + + if out_buffer.pos: + data = ffi.buffer(out_buffer.dst, out_buffer.pos)[:] + out_buffer.pos = 0 + yield data + + if zresult == 0: + return + + # Repeat loop to collect more input data. 
+ continue + + # If we get here, input is exhausted. + + def write_to(self, writer, write_size=DECOMPRESSION_RECOMMENDED_OUTPUT_SIZE): + if not hasattr(writer, 'write'): + raise ValueError('must pass an object with a write() method') + + return ZstdDecompressionWriter(self, writer, write_size) + + def copy_stream(self, ifh, ofh, + read_size=DECOMPRESSION_RECOMMENDED_INPUT_SIZE, + write_size=DECOMPRESSION_RECOMMENDED_OUTPUT_SIZE): + if not hasattr(ifh, 'read'): + raise ValueError('first argument must have a read() method') + if not hasattr(ofh, 'write'): + raise ValueError('second argument must have a write() method') + + dstream = self._get_dstream() + + in_buffer = ffi.new('ZSTD_inBuffer *') + out_buffer = ffi.new('ZSTD_outBuffer *') + + dst_buffer = ffi.new('char[]', write_size) + out_buffer.dst = dst_buffer + out_buffer.size = write_size + out_buffer.pos = 0 + + total_read, total_write = 0, 0 + + # Read all available input. + while True: + data = ifh.read(read_size) + if not data: + break + + data_buffer = ffi.from_buffer(data) + total_read += len(data_buffer) + in_buffer.src = data_buffer + in_buffer.size = len(data_buffer) + in_buffer.pos = 0 + + # Flush all read data to output. + while in_buffer.pos < in_buffer.size: + zresult = lib.ZSTD_decompressStream(dstream, out_buffer, in_buffer) + if lib.ZSTD_isError(zresult): + raise ZstdError('zstd decompressor error: %s' % + ffi.string(lib.ZSTD_getErrorName(zresult))) + + if out_buffer.pos: + ofh.write(ffi.buffer(out_buffer.dst, out_buffer.pos)) + total_write += out_buffer.pos + out_buffer.pos = 0 + + # Continue loop to keep reading. + + return total_read, total_write + + def decompress_content_dict_chain(self, frames): + if not isinstance(frames, list): + raise TypeError('argument must be a list') + + if not frames: + raise ValueError('empty input chain') + + # First chunk should not be using a dictionary. We handle it specially. 
+ chunk = frames[0] + if not isinstance(chunk, bytes_type): + raise ValueError('chunk 0 must be bytes') + + # All chunks should be zstd frames and should have content size set. + chunk_buffer = ffi.from_buffer(chunk) + params = ffi.new('ZSTD_frameParams *') + zresult = lib.ZSTD_getFrameParams(params, chunk_buffer, len(chunk_buffer)) + if lib.ZSTD_isError(zresult): + raise ValueError('chunk 0 is not a valid zstd frame') + elif zresult: + raise ValueError('chunk 0 is too small to contain a zstd frame') + + if not params.frameContentSize: + raise ValueError('chunk 0 missing content size in frame') + + dctx = lib.ZSTD_createDCtx() + if dctx == ffi.NULL: + raise MemoryError() + + dctx = ffi.gc(dctx, lib.ZSTD_freeDCtx) + + last_buffer = ffi.new('char[]', params.frameContentSize) + + zresult = lib.ZSTD_decompressDCtx(dctx, last_buffer, len(last_buffer), + chunk_buffer, len(chunk_buffer)) + if lib.ZSTD_isError(zresult): + raise ZstdError('could not decompress chunk 0: %s' % + ffi.string(lib.ZSTD_getErrorName(zresult))) + + # Special case of chain length of 1 + if len(frames) == 1: + return ffi.buffer(last_buffer, len(last_buffer))[:] + + i = 1 + while i < len(frames): + chunk = frames[i] + if not isinstance(chunk, bytes_type): + raise ValueError('chunk %d must be bytes' % i) + + chunk_buffer = ffi.from_buffer(chunk) + zresult = lib.ZSTD_getFrameParams(params, chunk_buffer, len(chunk_buffer)) + if lib.ZSTD_isError(zresult): + raise ValueError('chunk %d is not a valid zstd frame' % i) + elif zresult: + raise ValueError('chunk %d is too small to contain a zstd frame' % i) + + if not params.frameContentSize: + raise ValueError('chunk %d missing content size in frame' % i) + + dest_buffer = ffi.new('char[]', params.frameContentSize) + + zresult = lib.ZSTD_decompress_usingDict(dctx, dest_buffer, len(dest_buffer), + chunk_buffer, len(chunk_buffer), + last_buffer, len(last_buffer)) + if lib.ZSTD_isError(zresult): + raise ZstdError('could not decompress chunk %d' % i) + + 
last_buffer = dest_buffer + i += 1 + + return ffi.buffer(last_buffer, len(last_buffer))[:] + + def _get_dstream(self): + dstream = lib.ZSTD_createDStream() + if dstream == ffi.NULL: + raise MemoryError() + + dstream = ffi.gc(dstream, lib.ZSTD_freeDStream) + + if self._dict_data: + zresult = lib.ZSTD_initDStream_usingDict(dstream, + self._dict_data.as_bytes(), + len(self._dict_data)) + else: + zresult = lib.ZSTD_initDStream(dstream) + + if lib.ZSTD_isError(zresult): + raise ZstdError('could not initialize DStream: %s' % + ffi.string(lib.ZSTD_getErrorName(zresult))) + + return dstream diff -r ed5b25874d99 -r 4baf79a77afa contrib/undumprevlog --- a/contrib/undumprevlog Thu Mar 23 19:54:59 2017 -0700 +++ b/contrib/undumprevlog Fri Mar 24 08:37:26 2017 -0700 @@ -9,15 +9,15 @@ from mercurial import ( node, revlog, - scmutil, transaction, util, + vfs as vfsmod, ) for fp in (sys.stdin, sys.stdout, sys.stderr): util.setbinary(fp) -opener = scmutil.opener('.', False) +opener = vfsmod.vfs('.', False) tr = transaction.transaction(sys.stderr.write, opener, {'store': opener}, "undump.journal") while True: diff -r ed5b25874d99 -r 4baf79a77afa contrib/win32/mercurial.ini --- a/contrib/win32/mercurial.ini Thu Mar 23 19:54:59 2017 -0700 +++ b/contrib/win32/mercurial.ini Fri Mar 24 08:37:26 2017 -0700 @@ -19,6 +19,8 @@ editor = notepad ; show changed files and be a bit more verbose if True ; verbose = True +; colorize commands output +; color = auto ; username data to appear in commits ; it usually takes the form: Joe User @@ -40,7 +42,6 @@ ;bugzilla = ;children = ;churn = -;color = ;convert = ;eol = ;extdiff = diff -r ed5b25874d99 -r 4baf79a77afa contrib/wix/help.wxs --- a/contrib/wix/help.wxs Thu Mar 23 19:54:59 2017 -0700 +++ b/contrib/wix/help.wxs Fri Mar 24 08:37:26 2017 -0700 @@ -15,6 +15,7 @@ + @@ -25,6 +26,7 @@ + @@ -37,6 +39,7 @@ + diff -r ed5b25874d99 -r 4baf79a77afa hgext/automv.py --- a/hgext/automv.py Thu Mar 23 19:54:59 2017 -0700 +++ b/hgext/automv.py Fri Mar 24 
08:37:26 2017 -0700 @@ -4,7 +4,7 @@ # # This software may be used and distributed according to the terms of the # GNU General Public License version 2 or any later version. -"""Check for unrecorded moves at commit time (EXPERIMENTAL) +"""check for unrecorded moves at commit time (EXPERIMENTAL) This extension checks at commit/amend time if any of the committed files comes from an unrecorded mv. diff -r ed5b25874d99 -r 4baf79a77afa hgext/bugzilla.py --- a/hgext/bugzilla.py Thu Mar 23 19:54:59 2017 -0700 +++ b/hgext/bugzilla.py Fri Mar 24 08:37:26 2017 -0700 @@ -15,14 +15,16 @@ The bug references can optionally include an update for Bugzilla of the hours spent working on the bug. Bugs can also be marked fixed. -Three basic modes of access to Bugzilla are provided: +Four basic modes of access to Bugzilla are provided: + +1. Access via the Bugzilla REST-API. Requires Bugzilla 5.0 or later. -1. Access via the Bugzilla XMLRPC interface. Requires Bugzilla 3.4 or later. +2. Access via the Bugzilla XMLRPC interface. Requires Bugzilla 3.4 or later. -2. Check data via the Bugzilla XMLRPC interface and submit bug change +3. Check data via the Bugzilla XMLRPC interface and submit bug change via email to Bugzilla email interface. Requires Bugzilla 3.4 or later. -3. Writing directly to the Bugzilla database. Only Bugzilla installations +4. Writing directly to the Bugzilla database. Only Bugzilla installations using MySQL are supported. Requires Python MySQLdb. Writing directly to the database is susceptible to schema changes, and @@ -50,11 +52,16 @@ Bugzilla is used instead as the source of the comment. Marking bugs fixed works on all supported Bugzilla versions. +Access via the REST-API needs either a Bugzilla username and password +or an apikey specified in the configuration. Comments are made under +the given username or the user associated with the apikey in Bugzilla. + Configuration items common to all access modes: bugzilla.version The access type to use.
Values recognized are: + :``restapi``: Bugzilla REST-API, Bugzilla 5.0 and later. :``xmlrpc``: Bugzilla XMLRPC interface. :``xmlrpc+email``: Bugzilla XMLRPC and email interfaces. :``3.0``: MySQL access, Bugzilla 3.0 and later. @@ -135,7 +142,7 @@ committer email to Bugzilla user email. See also ``bugzilla.usermap``. Contains entries of the form ``committer = Bugzilla user``. -XMLRPC access mode configuration: +XMLRPC and REST-API access mode configuration: bugzilla.bzurl The base URL for the Bugzilla installation. @@ -148,6 +155,13 @@ bugzilla.password The password for Bugzilla login. +REST-API access mode uses the options listed above as well as: + +bugzilla.apikey + An apikey generated on the Bugzilla instance for api access. + Using an apikey removes the need to store the user and password + options. + XMLRPC+email access mode uses the XMLRPC access mode configuration items, and also: @@ -279,6 +293,7 @@ from __future__ import absolute_import +import json import re import time @@ -288,10 +303,10 @@ cmdutil, error, mail, + url, util, ) -urlparse = util.urlparse xmlrpclib = util.xmlrpclib # Note for extension authors: ONLY specify testedwith = 'ships-with-hg-core' for @@ -641,7 +656,7 @@ self.bztoken = login.get('token', '') def transport(self, uri): - if urlparse.urlparse(uri, "http")[0] == "https": + if util.urlreq.urlparse(uri, "http")[0] == "https": return cookiesafetransport() else: return cookietransport() @@ -773,6 +788,136 @@ cmds.append(self.makecommandline("resolution", self.fixresolution)) self.send_bug_modify_email(bugid, cmds, text, committer) +class NotFound(LookupError): + pass + +class bzrestapi(bzaccess): + """Read and write bugzilla data using the REST API available since + Bugzilla 5.0. 
+ """ + def __init__(self, ui): + bzaccess.__init__(self, ui) + bz = self.ui.config('bugzilla', 'bzurl', + 'http://localhost/bugzilla/') + self.bzroot = '/'.join([bz, 'rest']) + self.apikey = self.ui.config('bugzilla', 'apikey', '') + self.user = self.ui.config('bugzilla', 'user', 'bugs') + self.passwd = self.ui.config('bugzilla', 'password') + self.fixstatus = self.ui.config('bugzilla', 'fixstatus', 'RESOLVED') + self.fixresolution = self.ui.config('bugzilla', 'fixresolution', + 'FIXED') + + def apiurl(self, targets, include_fields=None): + url = '/'.join([self.bzroot] + [str(t) for t in targets]) + qv = {} + if self.apikey: + qv['api_key'] = self.apikey + elif self.user and self.passwd: + qv['login'] = self.user + qv['password'] = self.passwd + if include_fields: + qv['include_fields'] = include_fields + if qv: + url = '%s?%s' % (url, util.urlreq.urlencode(qv)) + return url + + def _fetch(self, burl): + try: + resp = url.open(self.ui, burl) + return json.loads(resp.read()) + except util.urlerr.httperror as inst: + if inst.code == 401: + raise error.Abort(_('authorization failed')) + if inst.code == 404: + raise NotFound() + else: + raise + + def _submit(self, burl, data, method='POST'): + data = json.dumps(data) + if method == 'PUT': + class putrequest(util.urlreq.request): + def get_method(self): + return 'PUT' + request_type = putrequest + else: + request_type = util.urlreq.request + req = request_type(burl, data, + {'Content-Type': 'application/json'}) + try: + resp = url.opener(self.ui).open(req) + return json.loads(resp.read()) + except util.urlerr.httperror as inst: + if inst.code == 401: + raise error.Abort(_('authorization failed')) + if inst.code == 404: + raise NotFound() + else: + raise + + def filter_real_bug_ids(self, bugs): + '''remove bug IDs that do not exist in Bugzilla from bugs.''' + badbugs = set() + for bugid in bugs: + burl = self.apiurl(('bug', bugid), include_fields='status') + try: + self._fetch(burl) + except NotFound: + 
badbugs.add(bugid) + for bugid in badbugs: + del bugs[bugid] + + def filter_cset_known_bug_ids(self, node, bugs): + '''remove bug IDs where node occurs in comment text from bugs.''' + sn = short(node) + for bugid in bugs.keys(): + burl = self.apiurl(('bug', bugid, 'comment'), include_fields='text') + result = self._fetch(burl) + comments = result['bugs'][str(bugid)]['comments'] + if any(sn in c['text'] for c in comments): + self.ui.status(_('bug %d already knows about changeset %s\n') % + (bugid, sn)) + del bugs[bugid] + + def updatebug(self, bugid, newstate, text, committer): + '''update the specified bug. Add comment text and set new states. + + If possible add the comment as being from the committer of + the changeset. Otherwise use the default Bugzilla user. + ''' + bugmod = {} + if 'hours' in newstate: + bugmod['work_time'] = newstate['hours'] + if 'fix' in newstate: + bugmod['status'] = self.fixstatus + bugmod['resolution'] = self.fixresolution + if bugmod: + # if we have to change the bugs state do it here + bugmod['comment'] = { + 'comment': text, + 'is_private': False, + 'is_markdown': False, + } + burl = self.apiurl(('bug', bugid)) + self._submit(burl, bugmod, method='PUT') + self.ui.debug('updated bug %s\n' % bugid) + else: + burl = self.apiurl(('bug', bugid, 'comment')) + self._submit(burl, { + 'comment': text, + 'is_private': False, + 'is_markdown': False, + }) + self.ui.debug('added comment to bug %s\n' % bugid) + + def notify(self, bugs, committer): + '''Force sending of Bugzilla notification emails. + + Only required if the access method does not trigger notification + emails automatically. + ''' + pass + class bugzilla(object): # supported versions of bugzilla. different versions have # different schemas. 
@@ -781,7 +926,8 @@ '2.18': bzmysql_2_18, '3.0': bzmysql_3_0, 'xmlrpc': bzxmlrpc, - 'xmlrpc+email': bzxmlrpcemail + 'xmlrpc+email': bzxmlrpcemail, + 'restapi': bzrestapi, } _default_bug_re = (r'bugs?\s*,?\s*(?:#|nos?\.?|num(?:ber)?s?)?\s*' diff -r ed5b25874d99 -r 4baf79a77afa hgext/clonebundles.py --- a/hgext/clonebundles.py Thu Mar 23 19:54:59 2017 -0700 +++ b/hgext/clonebundles.py Fri Mar 24 08:37:26 2017 -0700 @@ -177,7 +177,7 @@ # Only advertise if a manifest exists. This does add some I/O to requests. # But this should be cheaper than a wasted network round trip due to # missing file. - if repo.opener.exists('clonebundles.manifest'): + if repo.vfs.exists('clonebundles.manifest'): caps.append('clonebundles') return caps diff -r ed5b25874d99 -r 4baf79a77afa hgext/color.py --- a/hgext/color.py Thu Mar 23 19:54:59 2017 -0700 +++ b/hgext/color.py Fri Mar 24 08:37:26 2017 -0700 @@ -5,652 +5,27 @@ # This software may be used and distributed according to the terms of the # GNU General Public License version 2 or any later version. -'''colorize output from some commands - -The color extension colorizes output from several Mercurial commands. -For example, the diff command shows additions in green and deletions -in red, while the status command shows modified files in magenta. Many -other commands have analogous colors. It is possible to customize -these colors. - -Effects -------- - -Other effects in addition to color, like bold and underlined text, are -also available. By default, the terminfo database is used to find the -terminal codes used to change color and effect. If terminfo is not -available, then effects are rendered with the ECMA-48 SGR control -function (aka ANSI escape codes). - -The available effects in terminfo mode are 'blink', 'bold', 'dim', -'inverse', 'invisible', 'italic', 'standout', and 'underline'; in -ECMA-48 mode, the options are 'bold', 'inverse', 'italic', and -'underline'. How each is rendered depends on the terminal emulator. 
-Some may not be available for a given terminal type, and will be -silently ignored. - -If the terminfo entry for your terminal is missing codes for an effect -or has the wrong codes, you can add or override those codes in your -configuration:: - - [color] - terminfo.dim = \E[2m - -where '\E' is substituted with an escape character. +'''enable Mercurial color mode (DEPRECATED) -Labels ------- - -Text receives color effects depending on the labels that it has. Many -default Mercurial commands emit labelled text. You can also define -your own labels in templates using the label function, see :hg:`help -templates`. A single portion of text may have more than one label. In -that case, effects given to the last label will override any other -effects. This includes the special "none" effect, which nullifies -other effects. - -Labels are normally invisible. In order to see these labels and their -position in the text, use the global --color=debug option. The same -anchor text may be associated to multiple labels, e.g. - - [log.changeset changeset.secret|changeset: 22611:6f0a53c8f587] - -The following are the default effects for some default labels. Default -effects may be overridden from your configuration file:: - - [color] - status.modified = blue bold underline red_background - status.added = green bold - status.removed = red bold blue_background - status.deleted = cyan bold underline - status.unknown = magenta bold underline - status.ignored = black bold - - # 'none' turns off all effects - status.clean = none - status.copied = none - - qseries.applied = blue bold underline - qseries.unapplied = black bold - qseries.missing = red bold +This extension enables Mercurial color mode. The feature is now directly +available in Mercurial core.
You can access it using:: - diff.diffline = bold - diff.extended = cyan bold - diff.file_a = red bold - diff.file_b = green bold - diff.hunk = magenta - diff.deleted = red - diff.inserted = green - diff.changed = white - diff.tab = - diff.trailingwhitespace = bold red_background - - # Blank so it inherits the style of the surrounding label - changeset.public = - changeset.draft = - changeset.secret = - - resolve.unresolved = red bold - resolve.resolved = green bold - - bookmarks.active = green - - branches.active = none - branches.closed = black bold - branches.current = green - branches.inactive = none - - tags.normal = green - tags.local = black bold - - rebase.rebased = blue - rebase.remaining = red bold - - shelve.age = cyan - shelve.newest = green bold - shelve.name = blue bold - - histedit.remaining = red bold - -Custom colors -------------- + [ui] + color = auto -Because there are only eight standard colors, this module allows you -to define color names for other color slots which might be available -for your terminal type, assuming terminfo mode. For instance:: - - color.brightblue = 12 - color.pink = 207 - color.orange = 202 - -to set 'brightblue' to color slot 12 (useful for 16 color terminals -that have brighter colors defined in the upper eight) and, 'pink' and -'orange' to colors in 256-color xterm's default color cube. These -defined colors may then be used as any of the pre-defined eight, -including appending '_background' to set the background to that color. - -Modes ------ - -By default, the color extension will use ANSI mode (or win32 mode on -Windows) if it detects a terminal. To override auto mode (to enable -terminfo mode, for example), set the following configuration option:: - - [color] - mode = terminfo - -Any value other than 'ansi', 'win32', 'terminfo', or 'auto' will -disable color. - -Note that on some systems, terminfo mode may cause problems when using -color with the pager extension and less -R. 
less with the -R option -will only display ECMA-48 color codes, and terminfo mode may sometimes -emit codes that less doesn't understand. You can work around this by -either using ansi mode (or auto mode), or by using less -r (which will -pass through all terminal control codes, not just color control -codes). - -On some systems (such as MSYS in Windows), the terminal may support -a different color mode than the pager (activated via the "pager" -extension). It is possible to define separate modes depending on whether -the pager is active:: - - [color] - mode = auto - pagermode = ansi - -If ``pagermode`` is not defined, the ``mode`` will be used. +See :hg:`help color` for details. ''' from __future__ import absolute_import -from mercurial.i18n import _ -from mercurial import ( - cmdutil, - color, - commands, - dispatch, - encoding, - extensions, - pycompat, - subrepo, - ui as uimod, - util, -) +from mercurial import color -cmdtable = {} -command = cmdutil.command(cmdtable) # Note for extension authors: ONLY specify testedwith = 'ships-with-hg-core' for # extensions which SHIP WITH MERCURIAL. Non-mainline extensions should # be specifying the version(s) of Mercurial they are tested with, or # leave the attribute unspecified. testedwith = 'ships-with-hg-core' -# start and stop parameters for effects -_effects = {'none': 0, 'black': 30, 'red': 31, 'green': 32, 'yellow': 33, - 'blue': 34, 'magenta': 35, 'cyan': 36, 'white': 37, 'bold': 1, - 'italic': 3, 'underline': 4, 'inverse': 7, 'dim': 2, - 'black_background': 40, 'red_background': 41, - 'green_background': 42, 'yellow_background': 43, - 'blue_background': 44, 'purple_background': 45, - 'cyan_background': 46, 'white_background': 47} - -def _terminfosetup(ui, mode): - '''Initialize terminfo data and the terminal if we're in terminfo mode.''' - - # If we failed to load curses, we go ahead and return. - if not _terminfo_params: - return - # Otherwise, see what the config file says. 
- if mode not in ('auto', 'terminfo'): - return - - _terminfo_params.update((key[6:], (False, int(val), '')) - for key, val in ui.configitems('color') - if key.startswith('color.')) - _terminfo_params.update((key[9:], (True, '', val.replace('\\E', '\x1b'))) - for key, val in ui.configitems('color') - if key.startswith('terminfo.')) - - try: - curses.setupterm() - except curses.error as e: - _terminfo_params.clear() - return - - for key, (b, e, c) in _terminfo_params.items(): - if not b: - continue - if not c and not curses.tigetstr(e): - # Most terminals don't support dim, invis, etc, so don't be - # noisy and use ui.debug(). - ui.debug("no terminfo entry for %s\n" % e) - del _terminfo_params[key] - if not curses.tigetstr('setaf') or not curses.tigetstr('setab'): - # Only warn about missing terminfo entries if we explicitly asked for - # terminfo mode. - if mode == "terminfo": - ui.warn(_("no terminfo entry for setab/setaf: reverting to " - "ECMA-48 color\n")) - _terminfo_params.clear() - -def _modesetup(ui, coloropt): - if coloropt == 'debug': - return 'debug' - - auto = (coloropt == 'auto') - always = not auto and util.parsebool(coloropt) - if not always and not auto: - return None - - formatted = (always or (encoding.environ.get('TERM') != 'dumb' - and ui.formatted())) - - mode = ui.config('color', 'mode', 'auto') - - # If pager is active, color.pagermode overrides color.mode. - if getattr(ui, 'pageractive', False): - mode = ui.config('color', 'pagermode', mode) - - realmode = mode - if mode == 'auto': - if pycompat.osname == 'nt': - term = encoding.environ.get('TERM') - # TERM won't be defined in a vanilla cmd.exe environment. - - # UNIX-like environments on Windows such as Cygwin and MSYS will - # set TERM. They appear to make a best effort attempt at setting it - # to something appropriate. However, not all environments with TERM - # defined support ANSI. Since "ansi" could result in terminal - # gibberish, we error on the side of selecting "win32". 
However, if - # w32effects is not defined, we almost certainly don't support - # "win32", so don't even try. - if (term and 'xterm' in term) or not w32effects: - realmode = 'ansi' - else: - realmode = 'win32' - else: - realmode = 'ansi' - - def modewarn(): - # only warn if color.mode was explicitly set and we're in - # a formatted terminal - if mode == realmode and ui.formatted(): - ui.warn(_('warning: failed to set color mode to %s\n') % mode) - - if realmode == 'win32': - _terminfo_params.clear() - if not w32effects: - modewarn() - return None - _effects.update(w32effects) - elif realmode == 'ansi': - _terminfo_params.clear() - elif realmode == 'terminfo': - _terminfosetup(ui, mode) - if not _terminfo_params: - ## FIXME Shouldn't we return None in this case too? - modewarn() - realmode = 'ansi' - else: - return None - - if always or (auto and formatted): - return realmode - return None - -try: - import curses - # Mapping from effect name to terminfo attribute name (or raw code) or - # color number. This will also force-load the curses module. 
- _terminfo_params = {'none': (True, 'sgr0', ''), - 'standout': (True, 'smso', ''), - 'underline': (True, 'smul', ''), - 'reverse': (True, 'rev', ''), - 'inverse': (True, 'rev', ''), - 'blink': (True, 'blink', ''), - 'dim': (True, 'dim', ''), - 'bold': (True, 'bold', ''), - 'invisible': (True, 'invis', ''), - 'italic': (True, 'sitm', ''), - 'black': (False, curses.COLOR_BLACK, ''), - 'red': (False, curses.COLOR_RED, ''), - 'green': (False, curses.COLOR_GREEN, ''), - 'yellow': (False, curses.COLOR_YELLOW, ''), - 'blue': (False, curses.COLOR_BLUE, ''), - 'magenta': (False, curses.COLOR_MAGENTA, ''), - 'cyan': (False, curses.COLOR_CYAN, ''), - 'white': (False, curses.COLOR_WHITE, '')} -except ImportError: - _terminfo_params = {} - -def _effect_str(effect): - '''Helper function for render_effects().''' - - bg = False - if effect.endswith('_background'): - bg = True - effect = effect[:-11] - try: - attr, val, termcode = _terminfo_params[effect] - except KeyError: - return '' - if attr: - if termcode: - return termcode - else: - return curses.tigetstr(val) - elif bg: - return curses.tparm(curses.tigetstr('setab'), val) - else: - return curses.tparm(curses.tigetstr('setaf'), val) - -def render_effects(text, effects): - 'Wrap text in commands to turn on each effect.' - if not text: - return text - if not _terminfo_params: - start = [str(_effects[e]) for e in ['none'] + effects.split()] - start = '\033[' + ';'.join(start) + 'm' - stop = '\033[' + str(_effects['none']) + 'm' - else: - start = ''.join(_effect_str(effect) - for effect in ['none'] + effects.split()) - stop = _effect_str('none') - return ''.join([start, text, stop]) - -def valideffect(effect): - 'Determine if the effect is valid or not.' - good = False - if not _terminfo_params and effect in _effects: - good = True - elif effect in _terminfo_params or effect[:-11] in _terminfo_params: - good = True - return good - -def configstyles(ui): - for status, cfgeffects in ui.configitems('color'): - if '.' 
not in status or status.startswith(('color.', 'terminfo.')): - continue - cfgeffects = ui.configlist('color', status) - if cfgeffects: - good = [] - for e in cfgeffects: - if valideffect(e): - good.append(e) - else: - ui.warn(_("ignoring unknown color/effect %r " - "(configured in color.%s)\n") - % (e, status)) - color._styles[status] = ' '.join(good) - -class colorui(uimod.ui): - _colormode = 'ansi' - def write(self, *args, **opts): - if self._colormode is None: - return super(colorui, self).write(*args, **opts) - - label = opts.get('label', '') - if self._buffers and not opts.get('prompt', False): - if self._bufferapplylabels: - self._buffers[-1].extend(self.label(a, label) for a in args) - else: - self._buffers[-1].extend(args) - elif self._colormode == 'win32': - for a in args: - win32print(a, super(colorui, self).write, **opts) - else: - return super(colorui, self).write( - *[self.label(a, label) for a in args], **opts) - - def write_err(self, *args, **opts): - if self._colormode is None: - return super(colorui, self).write_err(*args, **opts) - - label = opts.get('label', '') - if self._bufferstates and self._bufferstates[-1][0]: - return self.write(*args, **opts) - if self._colormode == 'win32': - for a in args: - win32print(a, super(colorui, self).write_err, **opts) - else: - return super(colorui, self).write_err( - *[self.label(a, label) for a in args], **opts) - - def showlabel(self, msg, label): - if label and msg: - if msg[-1] == '\n': - return "[%s|%s]\n" % (label, msg[:-1]) - else: - return "[%s|%s]" % (label, msg) - else: - return msg - - def label(self, msg, label): - if self._colormode is None: - return super(colorui, self).label(msg, label) - - if self._colormode == 'debug': - return self.showlabel(msg, label) - - effects = [] - for l in label.split(): - s = color._styles.get(l, '') - if s: - effects.append(s) - elif valideffect(l): - effects.append(l) - effects = ' '.join(effects) - if effects: - return '\n'.join([render_effects(line, effects) - 
for line in msg.split('\n')]) - return msg - -def uisetup(ui): - if ui.plain(): - return - if not isinstance(ui, colorui): - colorui.__bases__ = (ui.__class__,) - ui.__class__ = colorui - def colorcmd(orig, ui_, opts, cmd, cmdfunc): - mode = _modesetup(ui_, opts['color']) - colorui._colormode = mode - if mode and mode != 'debug': - configstyles(ui_) - return orig(ui_, opts, cmd, cmdfunc) - def colorgit(orig, gitsub, commands, env=None, stream=False, cwd=None): - if gitsub.ui._colormode and len(commands) and commands[0] == "diff": - # insert the argument in the front, - # the end of git diff arguments is used for paths - commands.insert(1, '--color') - return orig(gitsub, commands, env, stream, cwd) - extensions.wrapfunction(dispatch, '_runcommand', colorcmd) - extensions.wrapfunction(subrepo.gitsubrepo, '_gitnodir', colorgit) - def extsetup(ui): - commands.globalopts.append( - ('', 'color', 'auto', - # i18n: 'always', 'auto', 'never', and 'debug' are keywords - # and should not be translated - _("when to colorize (boolean, always, auto, never, or debug)"), - _('TYPE'))) - -@command('debugcolor', - [('', 'style', None, _('show all configured styles'))], - 'hg debugcolor') -def debugcolor(ui, repo, **opts): - """show available color, effects or style""" - ui.write(('color mode: %s\n') % ui._colormode) - if opts.get('style'): - return _debugdisplaystyle(ui) - else: - return _debugdisplaycolor(ui) - -def _debugdisplaycolor(ui): - oldstyle = color._styles.copy() - try: - color._styles.clear() - for effect in _effects.keys(): - color._styles[effect] = effect - if _terminfo_params: - for k, v in ui.configitems('color'): - if k.startswith('color.'): - color._styles[k] = k[6:] - elif k.startswith('terminfo.'): - color._styles[k] = k[9:] - ui.write(_('available colors:\n')) - # sort label with a '_' after the other to group '_background' entry. 
- items = sorted(color._styles.items(), - key=lambda i: ('_' in i[0], i[0], i[1])) - for colorname, label in items: - ui.write(('%s\n') % colorname, label=label) - finally: - color._styles.clear() - color._styles.update(oldstyle) - -def _debugdisplaystyle(ui): - ui.write(_('available style:\n')) - width = max(len(s) for s in color._styles) - for label, effects in sorted(color._styles.items()): - ui.write('%s' % label, label=label) - if effects: - # 50 - ui.write(': ') - ui.write(' ' * (max(0, width - len(label)))) - ui.write(', '.join(ui.label(e, e) for e in effects.split())) - ui.write('\n') - -if pycompat.osname != 'nt': - w32effects = None -else: - import ctypes - import re - - _kernel32 = ctypes.windll.kernel32 - - _WORD = ctypes.c_ushort - - _INVALID_HANDLE_VALUE = -1 - - class _COORD(ctypes.Structure): - _fields_ = [('X', ctypes.c_short), - ('Y', ctypes.c_short)] - - class _SMALL_RECT(ctypes.Structure): - _fields_ = [('Left', ctypes.c_short), - ('Top', ctypes.c_short), - ('Right', ctypes.c_short), - ('Bottom', ctypes.c_short)] - - class _CONSOLE_SCREEN_BUFFER_INFO(ctypes.Structure): - _fields_ = [('dwSize', _COORD), - ('dwCursorPosition', _COORD), - ('wAttributes', _WORD), - ('srWindow', _SMALL_RECT), - ('dwMaximumWindowSize', _COORD)] - - _STD_OUTPUT_HANDLE = 0xfffffff5 # (DWORD)-11 - _STD_ERROR_HANDLE = 0xfffffff4 # (DWORD)-12 - - _FOREGROUND_BLUE = 0x0001 - _FOREGROUND_GREEN = 0x0002 - _FOREGROUND_RED = 0x0004 - _FOREGROUND_INTENSITY = 0x0008 - - _BACKGROUND_BLUE = 0x0010 - _BACKGROUND_GREEN = 0x0020 - _BACKGROUND_RED = 0x0040 - _BACKGROUND_INTENSITY = 0x0080 - - _COMMON_LVB_REVERSE_VIDEO = 0x4000 - _COMMON_LVB_UNDERSCORE = 0x8000 - - # http://msdn.microsoft.com/en-us/library/ms682088%28VS.85%29.aspx - w32effects = { - 'none': -1, - 'black': 0, - 'red': _FOREGROUND_RED, - 'green': _FOREGROUND_GREEN, - 'yellow': _FOREGROUND_RED | _FOREGROUND_GREEN, - 'blue': _FOREGROUND_BLUE, - 'magenta': _FOREGROUND_BLUE | _FOREGROUND_RED, - 'cyan': _FOREGROUND_BLUE | 
_FOREGROUND_GREEN, - 'white': _FOREGROUND_RED | _FOREGROUND_GREEN | _FOREGROUND_BLUE, - 'bold': _FOREGROUND_INTENSITY, - 'black_background': 0x100, # unused value > 0x0f - 'red_background': _BACKGROUND_RED, - 'green_background': _BACKGROUND_GREEN, - 'yellow_background': _BACKGROUND_RED | _BACKGROUND_GREEN, - 'blue_background': _BACKGROUND_BLUE, - 'purple_background': _BACKGROUND_BLUE | _BACKGROUND_RED, - 'cyan_background': _BACKGROUND_BLUE | _BACKGROUND_GREEN, - 'white_background': (_BACKGROUND_RED | _BACKGROUND_GREEN | - _BACKGROUND_BLUE), - 'bold_background': _BACKGROUND_INTENSITY, - 'underline': _COMMON_LVB_UNDERSCORE, # double-byte charsets only - 'inverse': _COMMON_LVB_REVERSE_VIDEO, # double-byte charsets only - } - - passthrough = set([_FOREGROUND_INTENSITY, - _BACKGROUND_INTENSITY, - _COMMON_LVB_UNDERSCORE, - _COMMON_LVB_REVERSE_VIDEO]) - - stdout = _kernel32.GetStdHandle( - _STD_OUTPUT_HANDLE) # don't close the handle returned - if stdout is None or stdout == _INVALID_HANDLE_VALUE: - w32effects = None - else: - csbi = _CONSOLE_SCREEN_BUFFER_INFO() - if not _kernel32.GetConsoleScreenBufferInfo( - stdout, ctypes.byref(csbi)): - # stdout may not support GetConsoleScreenBufferInfo() - # when called from subprocess or redirected - w32effects = None - else: - origattr = csbi.wAttributes - ansire = re.compile('\033\[([^m]*)m([^\033]*)(.*)', - re.MULTILINE | re.DOTALL) - - def win32print(text, orig, **opts): - label = opts.get('label', '') - attr = origattr - - def mapcolor(val, attr): - if val == -1: - return origattr - elif val in passthrough: - return attr | val - elif val > 0x0f: - return (val & 0x70) | (attr & 0x8f) - else: - return (val & 0x07) | (attr & 0xf8) - - # determine console attributes based on labels - for l in label.split(): - style = color._styles.get(l, '') - for effect in style.split(): - try: - attr = mapcolor(w32effects[effect], attr) - except KeyError: - # w32effects could not have certain attributes so we skip - # them if not found - pass - 
# hack to ensure regexp finds data - if not text.startswith('\033['): - text = '\033[m' + text - - # Look for ANSI-like codes embedded in text - m = re.match(ansire, text) - - try: - while m: - for sattr in m.group(1).split(';'): - if sattr: - attr = mapcolor(int(sattr), attr) - _kernel32.SetConsoleTextAttribute(stdout, attr) - orig(m.group(2), **opts) - m = re.match(ansire, m.group(3)) - finally: - # Explicitly reset original attributes - _kernel32.SetConsoleTextAttribute(stdout, origattr) + # change default color config + color._enabledbydefault = True diff -r ed5b25874d99 -r 4baf79a77afa hgext/convert/cvsps.py --- a/hgext/convert/cvsps.py Thu Mar 23 19:54:59 2017 -0700 +++ b/hgext/convert/cvsps.py Fri Mar 24 08:37:26 2017 -0700 @@ -622,7 +622,7 @@ # Sort changesets by date odd = set() - def cscmp(l, r, odd=odd): + def cscmp(l, r): d = sum(l.date) - sum(r.date) if d: return d diff -r ed5b25874d99 -r 4baf79a77afa hgext/convert/hg.py --- a/hgext/convert/hg.py Thu Mar 23 19:54:59 2017 -0700 +++ b/hgext/convert/hg.py Fri Mar 24 08:37:26 2017 -0700 @@ -90,10 +90,10 @@ self.wlock.release() def revmapfile(self): - return self.repo.join("shamap") + return self.repo.vfs.join("shamap") def authorfile(self): - return self.repo.join("authormap") + return self.repo.vfs.join("authormap") def setbranch(self, branch, pbranches): if not self.clonebranches: @@ -625,7 +625,7 @@ def converted(self, rev, destrev): if self.convertfp is None: - self.convertfp = open(self.repo.join('shamap'), 'a') + self.convertfp = open(self.repo.vfs.join('shamap'), 'a') self.convertfp.write('%s %s\n' % (destrev, rev)) self.convertfp.flush() diff -r ed5b25874d99 -r 4baf79a77afa hgext/convert/p4.py --- a/hgext/convert/p4.py Thu Mar 23 19:54:59 2017 -0700 +++ b/hgext/convert/p4.py Fri Mar 24 08:37:26 2017 -0700 @@ -161,7 +161,12 @@ d = self._fetch_revision(change) c = self._construct_commit(d, parents) - shortdesc = c.desc.splitlines(True)[0].rstrip('\r\n') + descarr = c.desc.splitlines(True) + if 
len(descarr) > 0: + shortdesc = descarr[0].rstrip('\r\n') + else: + shortdesc = '**empty changelist description**' + t = '%s %s' % (c.rev, repr(shortdesc)[1:-1]) ui.status(util.ellipsis(t, 80) + '\n') diff -r ed5b25874d99 -r 4baf79a77afa hgext/convert/subversion.py --- a/hgext/convert/subversion.py Thu Mar 23 19:54:59 2017 -0700 +++ b/hgext/convert/subversion.py Fri Mar 24 08:37:26 2017 -0700 @@ -13,8 +13,8 @@ encoding, error, pycompat, - scmutil, util, + vfs as vfsmod, ) from . import common @@ -1146,8 +1146,8 @@ self.run0('checkout', path, wcpath) self.wc = wcpath - self.opener = scmutil.opener(self.wc) - self.wopener = scmutil.opener(self.wc) + self.opener = vfsmod.vfs(self.wc) + self.wopener = vfsmod.vfs(self.wc) self.childmap = mapfile(ui, self.join('hg-childmap')) if util.checkexec(self.wc): self.is_exec = util.isexec @@ -1186,7 +1186,7 @@ # best bet is to assume they are in local # encoding. They will be passed to command line calls # later anyway, so they better be. - m.add(encoding.tolocal(name.encode('utf-8'))) + m.add(encoding.unitolocal(name)) break return m @@ -1306,7 +1306,7 @@ self.setexec = [] fd, messagefile = tempfile.mkstemp(prefix='hg-convert-') - fp = os.fdopen(fd, 'w') + fp = os.fdopen(fd, pycompat.sysstr('w')) fp.write(commit.desc) fp.close() try: diff -r ed5b25874d99 -r 4baf79a77afa hgext/eol.py --- a/hgext/eol.py Thu Mar 23 19:54:59 2017 -0700 +++ b/hgext/eol.py Fri Mar 24 08:37:26 2017 -0700 @@ -223,7 +223,7 @@ if node is None: # Cannot use workingctx.data() since it would load # and cache the filters before we configure them. 
- data = repo.wfile('.hgeol').read() + data = repo.wvfs('.hgeol').read() else: data = repo[node]['.hgeol'].data() return eolfile(ui, repo.root, data) @@ -314,7 +314,7 @@ oldeol = None try: - cachemtime = os.path.getmtime(self.join("eol.cache")) + cachemtime = os.path.getmtime(self.vfs.join("eol.cache")) except OSError: cachemtime = 0 else: diff -r ed5b25874d99 -r 4baf79a77afa hgext/extdiff.py --- a/hgext/extdiff.py Thu Mar 23 19:54:59 2017 -0700 +++ b/hgext/extdiff.py Fri Mar 24 08:37:26 2017 -0700 @@ -273,7 +273,7 @@ cmdline = re.sub(regex, quote, cmdline) ui.debug('running %r in %s\n' % (cmdline, tmproot)) - ui.system(cmdline, cwd=tmproot) + ui.system(cmdline, cwd=tmproot, blockedtag='extdiff') for copy_fn, working_fn, mtime in fns_and_mtime: if os.lstat(copy_fn).st_mtime != mtime: @@ -342,7 +342,7 @@ def __init__(self, path, cmdline): # We can't pass non-ASCII through docstrings (and path is # in an unknown encoding anyway) - docpath = path.encode("string-escape") + docpath = util.escapestr(path) self.__doc__ = self.__doc__ % {'path': util.uirepr(docpath)} self._cmdline = cmdline diff -r ed5b25874d99 -r 4baf79a77afa hgext/fsmonitor/state.py --- a/hgext/fsmonitor/state.py Thu Mar 23 19:54:59 2017 -0700 +++ b/hgext/fsmonitor/state.py Fri Mar 24 08:37:26 2017 -0700 @@ -20,7 +20,7 @@ class state(object): def __init__(self, repo): - self._opener = repo.opener + self._vfs = repo.vfs self._ui = repo.ui self._rootdir = pathutil.normasprefix(repo.root) self._lastclock = None @@ -33,7 +33,7 @@ def get(self): try: - file = self._opener('fsmonitor.state', 'rb') + file = self._vfs('fsmonitor.state', 'rb') except IOError as inst: if inst.errno != errno.ENOENT: raise @@ -91,7 +91,7 @@ return try: - file = self._opener('fsmonitor.state', 'wb', atomictemp=True) + file = self._vfs('fsmonitor.state', 'wb', atomictemp=True) except (IOError, OSError): self._ui.warn(_("warning: unable to write out fsmonitor state\n")) return diff -r ed5b25874d99 -r 4baf79a77afa hgext/gpg.py --- 
a/hgext/gpg.py Thu Mar 23 19:54:59 2017 -0700 +++ b/hgext/gpg.py Fri Mar 24 08:37:26 2017 -0700 @@ -18,6 +18,7 @@ error, match, node as hgnode, + pycompat, util, ) @@ -44,11 +45,11 @@ try: # create temporary files fd, sigfile = tempfile.mkstemp(prefix="hg-gpg-", suffix=".sig") - fp = os.fdopen(fd, 'wb') + fp = os.fdopen(fd, pycompat.sysstr('wb')) fp.write(sig) fp.close() fd, datafile = tempfile.mkstemp(prefix="hg-gpg-", suffix=".txt") - fp = os.fdopen(fd, 'wb') + fp = os.fdopen(fd, pycompat.sysstr('wb')) fp.write(data) fp.close() gpgcmd = ("%s --logger-fd 1 --status-fd 1 --verify " @@ -280,7 +281,7 @@ raise error.Abort(_("working copy of .hgsigs is changed "), hint=_("please commit .hgsigs manually")) - sigsfile = repo.wfile(".hgsigs", "ab") + sigsfile = repo.wvfs(".hgsigs", "ab") sigsfile.write(sigmessage) sigsfile.close() diff -r ed5b25874d99 -r 4baf79a77afa hgext/hgk.py --- a/hgext/hgk.py Thu Mar 23 19:54:59 2017 -0700 +++ b/hgext/hgk.py Fri Mar 24 08:37:26 2017 -0700 @@ -71,8 +71,10 @@ inferrepo=True) def difftree(ui, repo, node1=None, node2=None, *files, **opts): """diff trees from two commits""" - def __difftree(repo, node1, node2, files=[]): + def __difftree(repo, node1, node2, files=None): assert node2 is not None + if files is None: + files = [] mmap = repo[node1].manifest() mmap2 = repo[node2].manifest() m = scmutil.match(repo[node1], files) @@ -345,4 +347,4 @@ cmd = ui.config("hgk", "path", "hgk") + " %s %s" % (optstr, " ".join(etc)) ui.debug("running %s\n" % cmd) - ui.system(cmd) + ui.system(cmd, blockedtag='hgk_view') diff -r ed5b25874d99 -r 4baf79a77afa hgext/histedit.py --- a/hgext/histedit.py Thu Mar 23 19:54:59 2017 -0700 +++ b/hgext/histedit.py Fri Mar 24 08:37:26 2017 -0700 @@ -36,7 +36,7 @@ # p, pick = use commit # e, edit = use commit, but stop for amending # f, fold = use commit, but combine it with the one above - # r, roll = like fold, but discard this commit's description + # r, roll = like fold, but discard this commit's description and 
date # d, drop = remove commit from history # m, mess = edit commit message without changing commit content # @@ -58,7 +58,7 @@ # p, pick = use commit # e, edit = use commit, but stop for amending # f, fold = use commit, but combine it with the one above - # r, roll = like fold, but discard this commit's description + # r, roll = like fold, but discard this commit's description and date # d, drop = remove commit from history # m, mess = edit commit message without changing commit content # @@ -71,11 +71,11 @@ *** Add delta -Edit the commit message to your liking, then close the editor. For -this example, let's assume that the commit message was changed to -``Add beta and delta.`` After histedit has run and had a chance to -remove any old or temporary revisions it needed, the history looks -like this:: +Edit the commit message to your liking, then close the editor. The date used +for the commit will be the later of the two commits' dates. For this example, +let's assume that the commit message was changed to ``Add beta and delta.`` +After histedit has run and had a chance to remove any old or temporary +revisions it needed, the history looks like this:: @ 2[tip] 989b4d060121 2009-04-27 18:04 -0500 durin42 | Add beta and delta. @@ -97,9 +97,10 @@ allowing you to edit files freely, or even use ``hg record`` to commit some changes as a separate commit. When you're done, any remaining uncommitted changes will be committed as well. When done, run ``hg -histedit --continue`` to finish this step. You'll be prompted for a -new commit message, but the default commit message will be the -original message for the ``edit`` ed revision. +histedit --continue`` to finish this step. If there are uncommitted +changes, you'll be prompted for a new commit message, but the default +commit message will be the original message for the ``edit`` ed +revision, and the date of the original commit will be preserved. 
The ``message`` operation will give you a chance to revise a commit message without changing the contents. It's a shortcut for doing @@ -167,6 +168,15 @@ [histedit] dropmissing = True +By default, histedit will close the transaction after each action. For +performance purposes, you can configure histedit to use a single transaction +across the entire histedit. WARNING: This setting introduces a significant risk +of losing the work you've done in a histedit if the histedit aborts +unexpectedly:: + + [histedit] + singletransaction = True + """ from __future__ import absolute_import @@ -268,6 +278,7 @@ self.lock = lock self.wlock = wlock self.backupfile = None + self.tr = None if replacements is None: self.replacements = [] else: @@ -299,8 +310,15 @@ self.replacements = replacements self.backupfile = backupfile - def write(self): - fp = self.repo.vfs('histedit-state', 'w') + def write(self, tr=None): + if tr: + tr.addfilegenerator('histedit-state', ('histedit-state',), + self._write, location='plain') + else: + with self.repo.vfs("histedit-state", "w") as f: + self._write(f) + + def _write(self, fp): fp.write('v1\n') fp.write('%s\n' % node.hex(self.parentctxnode)) fp.write('%s\n' % node.hex(self.topmost)) @@ -316,7 +334,6 @@ if not backupfile: backupfile = '' fp.write('%s\n' % backupfile) - fp.close() def _load(self): fp = self.repo.vfs('histedit-state', 'r') @@ -501,16 +518,12 @@ """ phasemin = src.phase() def commitfunc(**kwargs): - phasebackup = repo.ui.backupconfig('phases', 'new-commit') - try: - repo.ui.setconfig('phases', 'new-commit', phasemin, - 'histedit') + overrides = {('phases', 'new-commit'): phasemin} + with repo.ui.configoverride(overrides, 'histedit'): extra = kwargs.get('extra', {}).copy() extra['histedit_source'] = src.hex() kwargs['extra'] = extra return repo.commit(**kwargs) - finally: - repo.ui.restoreconfig(phasebackup) return commitfunc def applychanges(ui, repo, ctx, opts): @@ -724,6 +737,15 @@ """ return True + def firstdate(self): + 
"""Returns true if the rule should preserve the date of the first + change. + + This exists mainly so that 'rollup' rules can be a subclass of + 'fold'. + """ + return False + def finishfold(self, ui, repo, ctx, oldctx, newnode, internalchanges): parent = ctx.parents()[0].node() repo.ui.pushbuffer() @@ -742,21 +764,21 @@ [oldctx.description()]) + '\n' commitopts['message'] = newmessage # date - commitopts['date'] = max(ctx.date(), oldctx.date()) + if self.firstdate(): + commitopts['date'] = ctx.date() + else: + commitopts['date'] = max(ctx.date(), oldctx.date()) extra = ctx.extra().copy() # histedit_source # note: ctx is likely a temporary commit but that the best we can do # here. This is sufficient to solve issue3681 anyway. extra['histedit_source'] = '%s,%s' % (ctx.hex(), oldctx.hex()) commitopts['extra'] = extra - phasebackup = repo.ui.backupconfig('phases', 'new-commit') - try: - phasemin = max(ctx.phase(), oldctx.phase()) - repo.ui.setconfig('phases', 'new-commit', phasemin, 'histedit') + phasemin = max(ctx.phase(), oldctx.phase()) + overrides = {('phases', 'new-commit'): phasemin} + with repo.ui.configoverride(overrides, 'histedit'): n = collapse(repo, ctx, repo[newnode], commitopts, skipprompt=self.skipprompt()) - finally: - repo.ui.restoreconfig(phasebackup) if n is None: return ctx, [] repo.ui.pushbuffer() @@ -809,7 +831,7 @@ return True @action(["roll", "r"], - _("like fold, but discard this commit's description")) + _("like fold, but discard this commit's description and date")) class rollup(fold): def mergedescs(self): return False @@ -817,6 +839,9 @@ def skipprompt(self): return True + def firstdate(self): + return True + @action(["drop", "d"], _('remove commit from history')) class drop(histeditaction): @@ -884,11 +909,11 @@ - `mess` to reword the changeset commit message - - `fold` to combine it with the preceding changeset + - `fold` to combine it with the preceding changeset (using the later date) - - `roll` like fold, but discarding this commit's 
description + - `roll` like fold, but discarding this commit's description and date - - `edit` to edit this changeset + - `edit` to edit this changeset (preserving date) There are a number of ways to select the root changeset: @@ -992,7 +1017,8 @@ def _readfile(ui, path): if path == '-': - return ui.fin.read() + with ui.timeblockedsection('histedit'): + return ui.fin.read() else: with open(path, 'rb') as f: return f.read() @@ -1082,17 +1108,45 @@ total = len(state.actions) pos = 0 - while state.actions: - state.write() - actobj = state.actions.pop(0) - pos += 1 - ui.progress(_("editing"), pos, actobj.torule(), - _('changes'), total) - ui.debug('histedit: processing %s %s\n' % (actobj.verb,\ - actobj.torule())) - parentctx, replacement_ = actobj.run() - state.parentctxnode = parentctx.node() - state.replacements.extend(replacement_) + state.tr = None + + # Force an initial state file write, so the user can run --abort/continue + # even if there's an exception before the first transaction serialize. + state.write() + try: + # Don't use singletransaction by default since it rolls the entire + # transaction back if an unexpected exception happens (like a + # pretxncommit hook throws, or the user aborts the commit msg editor). + if ui.configbool("histedit", "singletransaction", False): + # Don't use a 'with' for the transaction, since actions may close + # and reopen a transaction. For example, if the action executes an + # external process it may choose to commit the transaction first. 
+ state.tr = repo.transaction('histedit') + + while state.actions: + state.write(tr=state.tr) + actobj = state.actions[0] + pos += 1 + ui.progress(_("editing"), pos, actobj.torule(), + _('changes'), total) + ui.debug('histedit: processing %s %s\n' % (actobj.verb,\ + actobj.torule())) + parentctx, replacement_ = actobj.run() + state.parentctxnode = parentctx.node() + state.replacements.extend(replacement_) + state.actions.pop(0) + + if state.tr is not None: + state.tr.close() + except error.InterventionRequired: + if state.tr is not None: + state.tr.close() + raise + except Exception: + if state.tr is not None: + state.tr.abort() + raise + state.write() ui.progress(_("editing"), None) @@ -1115,29 +1169,13 @@ for n in succs[1:]: ui.debug(m % node.short(n)) - supportsmarkers = obsolete.isenabled(repo, obsolete.createmarkersopt) - if supportsmarkers: - # Only create markers if the temp nodes weren't already removed. - obsolete.createmarkers(repo, ((repo[t],()) for t in sorted(tmpnodes) - if t in repo)) - else: - cleanupnode(ui, repo, 'temp', tmpnodes) + safecleanupnode(ui, repo, 'temp', tmpnodes) if not state.keep: if mapping: movebookmarks(ui, repo, mapping, state.topmost, ntm) # TODO update mq state - if supportsmarkers: - markers = [] - # sort by revision number because it sound "right" - for prec in sorted(mapping, key=repo.changelog.rev): - succs = mapping[prec] - markers.append((repo[prec], - tuple(repo[s] for s in succs))) - if markers: - obsolete.createmarkers(repo, markers) - else: - cleanupnode(ui, repo, 'replaced', mapping) + safecleanupnode(ui, repo, 'replaced', mapping) state.clear() if os.path.exists(repo.sjoin('undo')): @@ -1154,7 +1192,7 @@ # Recover our old commits if necessary if not state.topmost in repo and state.backupfile: - backupfile = repo.join(state.backupfile) + backupfile = repo.vfs.join(state.backupfile) f = hg.openpath(ui, backupfile) gen = exchange.readbundle(ui, f, backupfile) with repo.transaction('histedit.abort') as tr: @@ -1171,8 
+1209,8 @@ if repo.unfiltered().revs('parents() and (%n or %ln::)', state.parentctxnode, leafs | tmpnodes): hg.clean(repo, state.topmost, show_stats=True, quietempty=True) - cleanupnode(ui, repo, 'created', tmpnodes) - cleanupnode(ui, repo, 'temp', leafs) + safecleanupnode(ui, repo, 'created', tmpnodes) + safecleanupnode(ui, repo, 'temp', leafs) except Exception: if state.inprogress(): ui.warn(_('warning: encountered an exception during histedit ' @@ -1340,7 +1378,7 @@ # Save edit rules in .hg/histedit-last-edit.txt in case # the user needs to ask for help after something # surprising happens. - f = open(repo.join('histedit-last-edit.txt'), 'w') + f = open(repo.vfs.join('histedit-last-edit.txt'), 'w') f.write(rules) f.close() @@ -1542,27 +1580,49 @@ finally: release(tr, lock) -def cleanupnode(ui, repo, name, nodes): - """strip a group of nodes from the repository +def safecleanupnode(ui, repo, name, nodes): + """strip or obsolete nodes - The set of node to strip may contains unknown nodes.""" - ui.debug('should strip %s nodes %s\n' % - (name, ', '.join([node.short(n) for n in nodes]))) - with repo.lock(): - # do not let filtering get in the way of the cleanse - # we should probably get rid of obsolescence marker created during the - # histedit, but we currently do not have such information. - repo = repo.unfiltered() - # Find all nodes that need to be stripped - # (we use %lr instead of %ln to silently ignore unknown items) - nm = repo.changelog.nodemap - nodes = sorted(n for n in nodes if n in nm) - roots = [c.node() for c in repo.set("roots(%ln)", nodes)] - for c in roots: - # We should process node in reverse order to strip tip most first. - # but this trigger a bug in changegroup hook. - # This would reduce bundle overhead - repair.strip(ui, repo, c) + nodes could be either a set or dict which maps to replacements. + nodes could be unknown (outside the repo). 
+ """ + supportsmarkers = obsolete.isenabled(repo, obsolete.createmarkersopt) + if supportsmarkers: + if util.safehasattr(nodes, 'get'): + # nodes is a dict-like mapping + # use unfiltered repo for successors in case they are hidden + urepo = repo.unfiltered() + def getmarker(prec): + succs = tuple(urepo[n] for n in nodes.get(prec, ())) + return (repo[prec], succs) + else: + # nodes is a set-like + def getmarker(prec): + return (repo[prec], ()) + # sort by revision number because it sound "right" + sortednodes = sorted([n for n in nodes if n in repo], + key=repo.changelog.rev) + markers = [getmarker(t) for t in sortednodes] + if markers: + obsolete.createmarkers(repo, markers) + else: + ui.debug('should strip %s nodes %s\n' % + (name, ', '.join([node.short(n) for n in nodes]))) + with repo.lock(): + # Do not let filtering get in the way of the cleanse we should + # probably get rid of obsolescence marker created during the + # histedit, but we currently do not have such information. + repo = repo.unfiltered() + # Find all nodes that need to be stripped + # (we use %lr instead of %ln to silently ignore unknown items) + nm = repo.changelog.nodemap + nodes = sorted(n for n in nodes if n in nm) + roots = [c.node() for c in repo.set("roots(%ln)", nodes)] + for c in roots: + # We should process node in reverse order to strip tip most + # first, but this trigger a bug in changegroup hook. 
This + # would reduce bundle overhead + repair.strip(ui, repo, c) def stripwrapper(orig, ui, repo, nodelist, *args, **kwargs): if isinstance(nodelist, str): @@ -1581,7 +1641,7 @@ extensions.wrapfunction(repair, 'strip', stripwrapper) def summaryhook(ui, repo): - if not os.path.exists(repo.join('histedit-state')): + if not os.path.exists(repo.vfs.join('histedit-state')): return state = histeditstate(repo) state.read() diff -r ed5b25874d99 -r 4baf79a77afa hgext/journal.py --- a/hgext/journal.py Thu Mar 23 19:54:59 2017 -0700 +++ b/hgext/journal.py Fri Mar 24 08:37:26 2017 -0700 @@ -4,7 +4,7 @@ # # This software may be used and distributed according to the terms of the # GNU General Public License version 2 or any later version. -"""Track previous positions of bookmarks (EXPERIMENTAL) +"""track previous positions of bookmarks (EXPERIMENTAL) This extension adds a new command: `hg journal`, which shows you where bookmarks were previously located. @@ -163,7 +163,7 @@ # to copy. move shared date over from source to destination but # move the local file first if repo.vfs.exists('namejournal'): - journalpath = repo.join('namejournal') + journalpath = repo.vfs.join('namejournal') util.rename(journalpath, journalpath + '.bak') storage = repo.journal local = storage._open( diff -r ed5b25874d99 -r 4baf79a77afa hgext/keyword.py --- a/hgext/keyword.py Thu Mar 23 19:54:59 2017 -0700 +++ b/hgext/keyword.py Fri Mar 24 08:37:26 2017 -0700 @@ -438,7 +438,7 @@ # simulate hgrc parsing rcmaps = '[keywordmaps]\n%s\n' % '\n'.join(args) repo.vfs.write('hgrc', rcmaps) - ui.readconfig(repo.join('hgrc')) + ui.readconfig(repo.vfs.join('hgrc')) kwmaps = dict(ui.configitems('keywordmaps')) elif opts.get('default'): if svn: diff -r ed5b25874d99 -r 4baf79a77afa hgext/largefiles/lfutil.py --- a/hgext/largefiles/lfutil.py Thu Mar 23 19:54:59 2017 -0700 +++ b/hgext/largefiles/lfutil.py Fri Mar 24 08:37:26 2017 -0700 @@ -27,6 +27,7 @@ pycompat, scmutil, util, + vfs as vfsmod, ) shortname = '.hglf' @@ 
-144,7 +145,7 @@ ''' vfs = repo.vfs lfstoredir = longname - opener = scmutil.opener(vfs.join(lfstoredir)) + opener = vfsmod.vfs(vfs.join(lfstoredir)) lfdirstate = largefilesdirstate(opener, ui, repo.root, repo.dirstate._validate) @@ -201,7 +202,7 @@ file with the given hash.''' if not forcelocal and repo.shared(): return repo.vfs.reljoin(repo.sharedpath, longname, hash) - return repo.join(longname, hash) + return repo.vfs.join(longname, hash) def findstorepath(repo, hash): '''Search through the local store path(s) to find the file for the given diff -r ed5b25874d99 -r 4baf79a77afa hgext/largefiles/overrides.py --- a/hgext/largefiles/overrides.py Thu Mar 23 19:54:59 2017 -0700 +++ b/hgext/largefiles/overrides.py Fri Mar 24 08:37:26 2017 -0700 @@ -22,8 +22,8 @@ match as matchmod, pathutil, registrar, - revset, scmutil, + smartset, util, ) @@ -223,7 +223,7 @@ if not opts.get('dry_run'): if not after: - util.unlinkpath(repo.wjoin(f), ignoremissing=True) + repo.wvfs.unlinkpath(f, ignoremissing=True) if opts.get('dry_run'): return result @@ -233,7 +233,7 @@ # function handle this. if not isaddremove: for f in remove: - util.unlinkpath(repo.wjoin(f), ignoremissing=True) + repo.wvfs.unlinkpath(f, ignoremissing=True) repo[None].forget(remove) for f in remove: @@ -694,7 +694,7 @@ # The file is gone, but this deletes any empty parent # directories as a side-effect. 
- util.unlinkpath(repo.wjoin(srclfile), True) + repo.wvfs.unlinkpath(srclfile, ignoremissing=True) lfdirstate.remove(srclfile) else: util.copyfile(repo.wjoin(srclfile), @@ -855,7 +855,7 @@ firstpulled = repo.firstpulled except AttributeError: raise error.Abort(_("pulled() only available in --lfrev")) - return revset.baseset([r for r in subset if r >= firstpulled]) + return smartset.baseset([r for r in subset if r >= firstpulled]) def overrideclone(orig, ui, source, dest=None, **opts): d = dest @@ -993,9 +993,9 @@ archiver.done() -def hgsubrepoarchive(orig, repo, archiver, prefix, match=None): +def hgsubrepoarchive(orig, repo, archiver, prefix, match=None, decode=True): if not repo._repo.lfstatus: - return orig(repo, archiver, prefix, match) + return orig(repo, archiver, prefix, match, decode) repo._get(repo._state + ('hg',)) rev = repo._state[1] @@ -1010,6 +1010,8 @@ if match and not match(f): return data = getdata() + if decode: + data = repo._repo.wwritedata(name, data) archiver.addfile(prefix + repo._path + '/' + name, mode, islink, data) @@ -1037,7 +1039,7 @@ sub = ctx.workingsub(subpath) submatch = matchmod.subdirmatcher(subpath, match) sub._repo.lfstatus = True - sub.archive(archiver, prefix + repo._path + '/', submatch) + sub.archive(archiver, prefix + repo._path + '/', submatch, decode) # If a largefile is modified, the change is not reflected in its # standin until a commit. 
cmdutil.bailifchanged() raises an exception @@ -1094,7 +1096,7 @@ lfdirstate.write() standins = [lfutil.standin(f) for f in forget] for f in standins: - util.unlinkpath(repo.wjoin(f), ignoremissing=True) + repo.wvfs.unlinkpath(f, ignoremissing=True) rejected = repo[None].forget(standins) bad.extend(f for f in rejected if f in m.files()) diff -r ed5b25874d99 -r 4baf79a77afa hgext/largefiles/reposetup.py --- a/hgext/largefiles/reposetup.py Thu Mar 23 19:54:59 2017 -0700 +++ b/hgext/largefiles/reposetup.py Fri Mar 24 08:37:26 2017 -0700 @@ -272,7 +272,9 @@ # contents updated to reflect the hash of their largefile. # Do that here. def commit(self, text="", user=None, date=None, match=None, - force=False, editor=False, extra={}): + force=False, editor=False, extra=None): + if extra is None: + extra = {} orig = super(lfilesrepo, self).commit with self.wlock(): diff -r ed5b25874d99 -r 4baf79a77afa hgext/logtoprocess.py --- a/hgext/logtoprocess.py Thu Mar 23 19:54:59 2017 -0700 +++ b/hgext/logtoprocess.py Fri Mar 24 08:37:26 2017 -0700 @@ -4,7 +4,7 @@ # # This software may be used and distributed according to the terms of the # GNU General Public License version 2 or any later version. -"""Send ui.log() data to a subprocess (EXPERIMENTAL) +"""send ui.log() data to a subprocess (EXPERIMENTAL) This extension lets you specify a shell command per ui.log() event, sending all remaining arguments to as environment variables to that command. diff -r ed5b25874d99 -r 4baf79a77afa hgext/mq.py --- a/hgext/mq.py Thu Mar 23 19:54:59 2017 -0700 +++ b/hgext/mq.py Fri Mar 24 08:37:26 2017 -0700 @@ -14,7 +14,7 @@ Known patches are represented as patch files in the .hg/patches directory. Applied patches are both patch files and changesets. 
-Common tasks (use :hg:`help command` for more details):: +Common tasks (use :hg:`help COMMAND` for more details):: create new patch qnew import existing patch qimport @@ -89,10 +89,12 @@ phases, pycompat, registrar, - revset, + revsetlang, scmutil, + smartset, subrepo, util, + vfs as vfsmod, ) release = lockmod.release @@ -403,18 +405,12 @@ if phase is None: if repo.ui.configbool('mq', 'secret', False): phase = phases.secret + overrides = {('ui', 'allowemptycommit'): True} if phase is not None: - phasebackup = repo.ui.backupconfig('phases', 'new-commit') - allowemptybackup = repo.ui.backupconfig('ui', 'allowemptycommit') - try: - if phase is not None: - repo.ui.setconfig('phases', 'new-commit', phase, 'mq') + overrides[('phases', 'new-commit')] = phase + with repo.ui.configoverride(overrides, 'mq'): repo.ui.setconfig('ui', 'allowemptycommit', True) return repo.commit(*args, **kwargs) - finally: - repo.ui.restoreconfig(allowemptybackup) - if phase is not None: - repo.ui.restoreconfig(phasebackup) class AbortNoCleanup(error.Abort): pass @@ -433,7 +429,7 @@ except IOError: curpath = os.path.join(path, 'patches') self.path = patchdir or curpath - self.opener = scmutil.opener(self.path) + self.opener = vfsmod.vfs(self.path) self.ui = ui self.baseui = baseui self.applieddirty = False @@ -719,7 +715,9 @@ util.rename(absf, absorig) def printdiff(self, repo, diffopts, node1, node2=None, files=None, - fp=None, changes=None, opts={}): + fp=None, changes=None, opts=None): + if opts is None: + opts = {} stat = opts.get('stat') m = scmutil.match(repo[node1], files, opts) cmdutil.diffordiffstat(self.ui, repo, diffopts, node1, node2, m, @@ -1118,6 +1116,10 @@ if name in self._reserved: raise error.Abort(_('"%s" cannot be used as the name of a patch') % name) + if name != name.strip(): + # whitespace is stripped by parseseries() + raise error.Abort(_('patch name cannot begin or end with ' + 'whitespace')) for prefix in ('.hg', '.mq'): if name.startswith(prefix): raise 
error.Abort(_('patch name cannot begin with "%s"') @@ -1477,7 +1479,7 @@ # created while patching for f in all_files: if f not in repo.dirstate: - util.unlinkpath(repo.wjoin(f), ignoremissing=True) + repo.wvfs.unlinkpath(f, ignoremissing=True) self.ui.warn(_('done\n')) raise @@ -1580,7 +1582,7 @@ self.backup(repo, tobackup) repo.dirstate.beginparentchange() for f in a: - util.unlinkpath(repo.wjoin(f), ignoremissing=True) + repo.wvfs.unlinkpath(f, ignoremissing=True) repo.dirstate.drop(f) for f in m + r: fctx = ctx[f] @@ -2675,6 +2677,7 @@ Returns 0 on success. """ + ui.pager('qdiff') repo.mq.diff(repo, pats, opts) return 0 @@ -2917,7 +2920,7 @@ opts = fixkeepchangesopts(ui, opts) if opts.get('merge'): if opts.get('name'): - newpath = repo.join(opts.get('name')) + newpath = repo.vfs.join(opts.get('name')) else: newpath, i = lastsavename(q.path) if not newpath: @@ -2957,7 +2960,7 @@ opts = fixkeepchangesopts(ui, opts) localupdate = True if opts.get('name'): - q = queue(ui, repo.baseui, repo.path, repo.join(opts.get('name'))) + q = queue(ui, repo.baseui, repo.path, repo.vfs.join(opts.get('name'))) ui.warn(_('using patch queue: %s\n') % q.path) localupdate = False else: @@ -3311,9 +3314,9 @@ def _queuedir(name): if name == 'patches': - return repo.join('patches') + return repo.vfs.join('patches') else: - return repo.join('patches-' + name) + return repo.vfs.join('patches-' + name) def _validname(name): for n in name: @@ -3336,7 +3339,7 @@ continue fh.write('%s\n' % (queue,)) fh.close() - util.rename(repo.join('patches.queues.new'), repo.join(_allqueues)) + repo.vfs.rename('patches.queues.new', _allqueues) if not name or opts.get('list') or opts.get('active'): current = _getcurrent() @@ -3389,7 +3392,7 @@ else: fh.write('%s\n' % (queue,)) fh.close() - util.rename(repo.join('patches.queues.new'), repo.join(_allqueues)) + repo.vfs.rename('patches.queues.new', _allqueues) _setactivenocheck(name) elif opts.get('delete'): _delete(name) @@ -3435,7 +3438,9 @@ raise 
error.Abort(errmsg) def commit(self, text="", user=None, date=None, match=None, - force=False, editor=False, extra={}): + force=False, editor=False, extra=None): + if extra is None: + extra = {} self.abortifwdirpatched( _('cannot commit over an applied mq patch'), force) @@ -3567,9 +3572,9 @@ def revsetmq(repo, subset, x): """Changesets managed by MQ. """ - revset.getargs(x, 0, 0, _("mq takes no arguments")) + revsetlang.getargs(x, 0, 0, _("mq takes no arguments")) applied = set([repo[r.node].rev() for r in repo.mq.applied]) - return revset.baseset([r for r in subset if r in applied]) + return smartset.baseset([r for r in subset if r in applied]) # tell hggettext to extract docstrings from these functions: i18nfunctions = [revsetmq] diff -r ed5b25874d99 -r 4baf79a77afa hgext/pager.py --- a/hgext/pager.py Thu Mar 23 19:54:59 2017 -0700 +++ b/hgext/pager.py Fri Mar 24 08:37:26 2017 -0700 @@ -12,68 +12,22 @@ # # Run 'hg help pager' to get info on configuration. -'''browse command output with an external pager - -To set the pager that should be used, set the application variable:: - - [pager] - pager = less -FRX - -If no pager is set, the pager extensions uses the environment variable -$PAGER. If neither pager.pager, nor $PAGER is set, no pager is used. - -You can disable the pager for certain commands by adding them to the -pager.ignore list:: +'''browse command output with an external pager (DEPRECATED) - [pager] - ignore = version, help, update - -You can also enable the pager only for certain commands using -pager.attend. Below is the default list of commands to be paged:: - - [pager] - attend = annotate, cat, diff, export, glog, log, qdiff - -Setting pager.attend to an empty value will cause all commands to be -paged. - -If pager.attend is present, pager.ignore will be ignored. - -Lastly, you can enable and disable paging for individual commands with -the attend- option. 
This setting takes precedence over -existing attend and ignore options and defaults:: +Forcibly enable paging for individual commands that don't typically +request pagination with the attend- option. This setting +takes precedence over ignore options and defaults:: [pager] attend-cat = false - -To ignore global commands like :hg:`version` or :hg:`help`, you have -to specify them in your user configuration file. - -To control whether the pager is used at all for an individual command, -you can use --pager=:: - - - use as needed: `auto`. - - require the pager: `yes` or `on`. - - suppress the pager: `no` or `off` (any unrecognized value - will also work). - ''' from __future__ import absolute_import -import atexit -import os -import signal -import subprocess -import sys - -from mercurial.i18n import _ from mercurial import ( cmdutil, commands, dispatch, - encoding, extensions, - util, ) # Note for extension authors: ONLY specify testedwith = 'ships-with-hg-core' for @@ -82,58 +36,12 @@ # leave the attribute unspecified. 
testedwith = 'ships-with-hg-core' -def _runpager(ui, p): - pager = subprocess.Popen(p, shell=True, bufsize=-1, - close_fds=util.closefds, stdin=subprocess.PIPE, - stdout=util.stdout, stderr=util.stderr) - - # back up original file objects and descriptors - olduifout = ui.fout - oldstdout = util.stdout - stdoutfd = os.dup(util.stdout.fileno()) - stderrfd = os.dup(util.stderr.fileno()) - - # create new line-buffered stdout so that output can show up immediately - ui.fout = util.stdout = newstdout = os.fdopen(util.stdout.fileno(), 'wb', 1) - os.dup2(pager.stdin.fileno(), util.stdout.fileno()) - if ui._isatty(util.stderr): - os.dup2(pager.stdin.fileno(), util.stderr.fileno()) - - @atexit.register - def killpager(): - if util.safehasattr(signal, "SIGINT"): - signal.signal(signal.SIGINT, signal.SIG_IGN) - pager.stdin.close() - ui.fout = olduifout - util.stdout = oldstdout - # close new stdout while it's associated with pager; otherwise stdout - # fd would be closed when newstdout is deleted - newstdout.close() - # restore original fds: stdout is open again - os.dup2(stdoutfd, util.stdout.fileno()) - os.dup2(stderrfd, util.stderr.fileno()) - pager.wait() - def uisetup(ui): - class pagerui(ui.__class__): - def _runpager(self, pagercmd): - _runpager(self, pagercmd) - - ui.__class__ = pagerui def pagecmd(orig, ui, options, cmd, cmdfunc): - p = ui.config("pager", "pager", encoding.environ.get("PAGER")) - usepager = False - always = util.parsebool(options['pager']) auto = options['pager'] == 'auto' - - if not p or '--debugger' in sys.argv or not ui.formatted(): - pass - elif always: - usepager = True - elif not auto: + if auto and not ui.pageractive: usepager = False - else: attend = ui.configlist('pager', 'attend', attended) ignore = ui.configlist('pager', 'ignore') cmds, _ = cmdutil.findcmd(cmd, commands.table) @@ -148,27 +56,18 @@ usepager = True break - setattr(ui, 'pageractive', usepager) - - if usepager: - ui.setconfig('ui', 'formatted', ui.formatted(), 'pager') - 
ui.setconfig('ui', 'interactive', False, 'pager') - if util.safehasattr(signal, "SIGPIPE"): - signal.signal(signal.SIGPIPE, signal.SIG_DFL) - ui._runpager(p) + if usepager: + # Slight hack: the attend list is supposed to override + # the ignore list for the pager extension, but the + # core code doesn't know about attend, so we have to + # lobotomize the ignore list so that the extension's + # behavior is preserved. + ui.setconfig('pager', 'ignore', '', 'pager') + ui.pager('extension-via-attend-' + cmd) + else: + ui.disablepager() return orig(ui, options, cmd, cmdfunc) - # Wrap dispatch._runcommand after color is loaded so color can see - # ui.pageractive. Otherwise, if we loaded first, color's wrapped - # dispatch._runcommand would run without having access to ui.pageractive. - def afterloaded(loaded): - extensions.wrapfunction(dispatch, '_runcommand', pagecmd) - extensions.afterloaded('color', afterloaded) - -def extsetup(ui): - commands.globalopts.append( - ('', 'pager', 'auto', - _("when to paginate (boolean, always, auto, or never)"), - _('TYPE'))) + extensions.wrapfunction(dispatch, '_runcommand', pagecmd) attended = ['annotate', 'cat', 'diff', 'export', 'glog', 'log', 'qdiff'] diff -r ed5b25874d99 -r 4baf79a77afa hgext/patchbomb.py --- a/hgext/patchbomb.py Thu Mar 23 19:54:59 2017 -0700 +++ b/hgext/patchbomb.py Fri Mar 24 08:37:26 2017 -0700 @@ -60,6 +60,14 @@ intro=never # never include an introduction message intro=always # always include an introduction message +You can specify a template for flags to be added in subject prefixes. Flags +specified by --flag option are exported as ``{flags}`` keyword:: + + [patchbomb] + flagtemplate = "{separate(' ', + ifeq(branch, 'default', '', branch|upper), + flags)}" + You can set patchbomb to always ask for confirmation by setting ``patchbomb.confirm`` to true. 
''' @@ -75,13 +83,14 @@ from mercurial import ( cmdutil, commands, - encoding, error, + formatter, hg, mail, node as nodemod, patch, scmutil, + templater, util, ) stringio = util.stringio @@ -135,7 +144,32 @@ intro = 1 < number return intro -def makepatch(ui, repo, patchlines, opts, _charsets, idx, total, numbered, +def _formatflags(ui, repo, rev, flags): + """build flag string optionally by template""" + tmpl = ui.config('patchbomb', 'flagtemplate') + if not tmpl: + return ' '.join(flags) + out = util.stringio() + opts = {'template': templater.unquotestring(tmpl)} + with formatter.templateformatter(ui, out, 'patchbombflag', opts) as fm: + fm.startitem() + fm.context(ctx=repo[rev]) + fm.write('flags', '%s', fm.formatlist(flags, name='flag')) + return out.getvalue() + +def _formatprefix(ui, repo, rev, flags, idx, total, numbered): + """build prefix to patch subject""" + flag = _formatflags(ui, repo, rev, flags) + if flag: + flag = ' ' + flag + + if not numbered: + return '[PATCH%s]' % flag + else: + tlen = len(str(total)) + return '[PATCH %0*d of %d%s]' % (tlen, idx, total, flag) + +def makepatch(ui, repo, rev, patchlines, opts, _charsets, idx, total, numbered, patchname=None): desc = [] @@ -202,16 +236,13 @@ else: msg = mail.mimetextpatch(body, display=opts.get('test')) - flag = ' '.join(opts.get('flag')) - if flag: - flag = ' ' + flag - + prefix = _formatprefix(ui, repo, rev, opts.get('flag'), idx, total, + numbered) subj = desc[0].strip().rstrip('. 
') if not numbered: - subj = '[PATCH%s] %s' % (flag, opts.get('subject') or subj) + subj = ' '.join([prefix, opts.get('subject') or subj]) else: - tlen = len(str(total)) - subj = '[PATCH %0*d of %d%s] %s' % (tlen, idx, total, flag, subj) + subj = ' '.join([prefix, subj]) msg['Subject'] = mail.headencode(ui, subj, _charsets, opts.get('test')) msg['X-Mercurial-Node'] = node msg['X-Mercurial-Series-Index'] = '%i' % idx @@ -303,19 +334,16 @@ msg['Subject'] = mail.headencode(ui, subj, _charsets, opts.get('test')) return [(msg, subj, None)] -def _makeintro(repo, sender, patches, **opts): +def _makeintro(repo, sender, revs, patches, **opts): """make an introduction email, asking the user for content if needed email is returned as (subject, body, cumulative-diffstat)""" ui = repo.ui _charsets = mail._charsets(ui) - tlen = len(str(len(patches))) - flag = opts.get('flag') or '' - if flag: - flag = ' ' + ' '.join(flag) - prefix = '[PATCH %0*d of %d%s]' % (tlen, 0, len(patches), flag) - + # use the last revision which is likely to be a bookmarked head + prefix = _formatprefix(ui, repo, revs.last(), opts.get('flag'), + 0, len(patches), numbered=True) subj = (opts.get('subject') or prompt(ui, '(optional) Subject: ', rest=prefix, default='')) if not subj: @@ -337,7 +365,7 @@ opts.get('test')) return (msg, subj, diffstat) -def _getpatchmsgs(repo, sender, patches, patchnames=None, **opts): +def _getpatchmsgs(repo, sender, revs, patchnames=None, **opts): """return a list of emails from a list of patches This involves introduction message creation if necessary. 
@@ -346,6 +374,7 @@ """ ui = repo.ui _charsets = mail._charsets(ui) + patches = list(_getpatches(repo, revs, **opts)) msgs = [] ui.write(_('this patch series consists of %d patches.\n\n') @@ -353,7 +382,7 @@ # build the intro message, or skip it if the user declines if introwanted(ui, opts, len(patches)): - msg = _makeintro(repo, sender, patches, **opts) + msg = _makeintro(repo, sender, revs, patches, **opts) if msg: msgs.append(msg) @@ -362,10 +391,11 @@ # now generate the actual patch messages name = None - for i, p in enumerate(patches): + assert len(revs) == len(patches) + for i, (r, p) in enumerate(zip(revs, patches)): if patchnames: name = patchnames[i] - msg = makepatch(ui, repo, p, opts, _charsets, i + 1, + msg = makepatch(ui, repo, r, p, opts, _charsets, i + 1, len(patches), numbered, name) msgs.append(msg) @@ -467,9 +497,7 @@ With -n/--test, all steps will run, but mail will not be sent. You will be prompted for an email recipient address, a subject and an introductory message describing the patches of your patchbomb. - Then when all is done, patchbomb messages are displayed. If the - PAGER environment variable is set, your pager will be fired up once - for each patchbomb message, so you can verify everything is alright. + Then when all is done, patchbomb messages are displayed. In case email sending fails, you will find a backup of your series introductory message in ``.hg/last-email.txt``. 
@@ -511,14 +539,12 @@ mbox = opts.get('mbox') outgoing = opts.get('outgoing') rev = opts.get('rev') - # internal option used by pbranches - patches = opts.get('patches') if not (opts.get('test') or mbox): # really sending mail.validateconfig(ui) - if not (revs or rev or outgoing or bundle or patches): + if not (revs or rev or outgoing or bundle): raise error.Abort(_('specify at least one changeset with -r or -o')) if outgoing and bundle: @@ -590,17 +616,13 @@ ui.config('patchbomb', 'from') or prompt(ui, 'From', ui.username())) - if patches: - msgs = _getpatchmsgs(repo, sender, patches, opts.get('patchnames'), - **opts) - elif bundle: + if bundle: bundledata = _getbundle(repo, dest, **opts) bundleopts = opts.copy() bundleopts.pop('bundle', None) # already processed msgs = _getbundlemsgs(repo, sender, bundledata, **bundleopts) else: - _patches = list(_getpatches(repo, revs, **opts)) - msgs = _getpatchmsgs(repo, sender, _patches, **opts) + msgs = _getpatchmsgs(repo, sender, revs, **opts) showaddrs = [] @@ -693,20 +715,14 @@ m['Reply-To'] = ', '.join(replyto) if opts.get('test'): ui.status(_('displaying '), subj, ' ...\n') - ui.flush() - if 'PAGER' in encoding.environ and not ui.plain(): - fp = util.popen(encoding.environ['PAGER'], 'w') - else: - fp = ui - generator = emailmod.Generator.Generator(fp, mangle_from_=False) + ui.pager('email') + generator = emailmod.Generator.Generator(ui, mangle_from_=False) try: generator.flatten(m, 0) - fp.write('\n') + ui.write('\n') except IOError as inst: if inst.errno != errno.EPIPE: raise - if fp is not ui: - fp.close() else: if not sendmail: sendmail = mail.connect(ui, mbox=mbox) diff -r ed5b25874d99 -r 4baf79a77afa hgext/rebase.py --- a/hgext/rebase.py Thu Mar 23 19:54:59 2017 -0700 +++ b/hgext/rebase.py Fri Mar 24 08:37:26 2017 -0700 @@ -47,6 +47,7 @@ repoview, revset, scmutil, + smartset, util, ) @@ -118,8 +119,8 @@ # i18n: "_rebasedefaultdest" is a keyword sourceset = None if x is not None: - sourceset = revset.getset(repo, 
revset.fullreposet(repo), x) - return subset & revset.baseset([_destrebase(repo, sourceset)]) + sourceset = revset.getset(repo, smartset.fullreposet(repo), x) + return subset & smartset.baseset([_destrebase(repo, sourceset)]) class rebaseruntime(object): """This class is a container for rebase runtime state""" @@ -158,6 +159,37 @@ self.keepopen = opts.get('keepopen', False) self.obsoletenotrebased = {} + def storestatus(self, tr=None): + """Store the current status to allow recovery""" + if tr: + tr.addfilegenerator('rebasestate', ('rebasestate',), + self._writestatus, location='plain') + else: + with self.repo.vfs("rebasestate", "w") as f: + self._writestatus(f) + + def _writestatus(self, f): + repo = self.repo + f.write(repo[self.originalwd].hex() + '\n') + f.write(repo[self.target].hex() + '\n') + f.write(repo[self.external].hex() + '\n') + f.write('%d\n' % int(self.collapsef)) + f.write('%d\n' % int(self.keepf)) + f.write('%d\n' % int(self.keepbranchesf)) + f.write('%s\n' % (self.activebookmark or '')) + for d, v in self.state.iteritems(): + oldrev = repo[d].hex() + if v >= 0: + newrev = repo[v].hex() + elif v == revtodo: + # To maintain format compatibility, we have to use nullid. + # Please do remove this special case when upgrading the format. 
+ newrev = hex(nullid) + else: + newrev = v + f.write("%s:%s\n" % (oldrev, newrev)) + repo.ui.debug('rebase status stored\n') + def restorestatus(self): """Restore a previously stored status""" repo = self.repo @@ -218,7 +250,7 @@ repo.ui.debug('computed skipped revs: %s\n' % (' '.join(str(r) for r in sorted(skipped)) or None)) repo.ui.debug('rebase status resumed\n') - _setrebasesetvisibility(repo, state.keys()) + _setrebasesetvisibility(repo, set(state.keys()) | set([originalwd])) self.originalwd = originalwd self.target = target @@ -251,7 +283,7 @@ def _prepareabortorcontinue(self, isabort): try: self.restorestatus() - self.collapsemsg = restorecollapsemsg(self.repo) + self.collapsemsg = restorecollapsemsg(self.repo, isabort) except error.RepoLookupError: if isabort: clearstatus(self.repo) @@ -294,11 +326,11 @@ self.ui.status(_('nothing to rebase\n')) return _nothingtorebase() - root = min(rebaseset) - if not self.keepf and not self.repo[root].mutable(): - raise error.Abort(_("can't rebase public changeset %s") - % self.repo[root], - hint=_("see 'hg help phases' for details")) + for root in self.repo.set('roots(%ld)', rebaseset): + if not self.keepf and not root.mutable(): + raise error.Abort(_("can't rebase public changeset %s") + % root, + hint=_("see 'hg help phases' for details")) (self.originalwd, self.target, self.state) = result if self.collapsef: @@ -311,7 +343,7 @@ if dest.closesbranch() and not self.keepbranchesf: self.ui.status(_('reopening closed branch head %s\n') % dest) - def _performrebase(self): + def _performrebase(self, tr): repo, ui, opts = self.repo, self.ui, self.opts if self.keepbranchesf: # insert _savebranch at the start of extrafns so if @@ -337,6 +369,10 @@ if self.activebookmark: bookmarks.deactivate(repo) + # Store the state before we begin so users can run 'hg rebase --abort' + # if we fail before the transaction closes. 
+ self.storestatus() + sortedrevs = repo.revs('sort(%ld, -topo)', self.state) cands = [k for k, v in self.state.iteritems() if v == revtodo] total = len(cands) @@ -357,10 +393,7 @@ self.state, self.targetancestors, self.obsoletenotrebased) - storestatus(repo, self.originalwd, self.target, - self.state, self.collapsef, self.keepf, - self.keepbranchesf, self.external, - self.activebookmark) + self.storestatus(tr=tr) storecollapsemsg(repo, self.collapsemsg) if len(repo[None].parents()) == 2: repo.ui.debug('resuming interrupted rebase\n') @@ -442,12 +475,24 @@ editopt = True editor = cmdutil.getcommiteditor(edit=editopt, editform=editform) revtoreuse = max(self.state) - newnode = concludenode(repo, revtoreuse, p1, self.external, - commitmsg=commitmsg, - extrafn=_makeextrafn(self.extrafns), - editor=editor, - keepbranches=self.keepbranchesf, - date=self.date) + dsguard = dirstateguard.dirstateguard(repo, 'rebase') + try: + newnode = concludenode(repo, revtoreuse, p1, self.external, + commitmsg=commitmsg, + extrafn=_makeextrafn(self.extrafns), + editor=editor, + keepbranches=self.keepbranchesf, + date=self.date) + dsguard.close() + release(dsguard) + except error.InterventionRequired: + dsguard.close() + release(dsguard) + raise + except Exception: + release(dsguard) + raise + if newnode is None: newrev = self.target else: @@ -617,6 +662,16 @@ hg rebase -r "branch(featureX)" -d 1.3 --keepbranches + Configuration Options: + + You can make rebase require a destination if you set the following config + option: + + [commands] + rebase.requiredest = False + + Return Values: + Returns 0 on success, 1 if nothing to rebase or there are unresolved conflicts. 
@@ -630,6 +685,12 @@ # Validate input and define rebasing points destf = opts.get('dest', None) + + if ui.config('commands', 'rebase.requiredest'): + if not destf: + raise error.Abort(_('you must specify a destination'), + hint=_('use: hg rebase -d REV')) + srcf = opts.get('source', None) basef = opts.get('base', None) revf = opts.get('rev', []) @@ -678,15 +739,31 @@ if retcode is not None: return retcode - rbsrt._performrebase() + with repo.transaction('rebase') as tr: + dsguard = dirstateguard.dirstateguard(repo, 'rebase') + try: + rbsrt._performrebase(tr) + dsguard.close() + release(dsguard) + except error.InterventionRequired: + dsguard.close() + release(dsguard) + tr.close() + raise + except Exception: + release(dsguard) + raise rbsrt._finishrebase() finally: release(lock, wlock) -def _definesets(ui, repo, destf=None, srcf=None, basef=None, revf=[], +def _definesets(ui, repo, destf=None, srcf=None, basef=None, revf=None, destspace=None): """use revisions argument to define destination and rebase set """ + if revf is None: + revf = [] + # destspace is here to work around issues with `hg pull --rebase` see # issue5214 for details if srcf and basef: @@ -799,36 +876,28 @@ '''Commit the wd changes with parents p1 and p2. Reuse commit info from rev but also store useful information in extra. 
Return node of committed revision.''' - dsguard = dirstateguard.dirstateguard(repo, 'rebase') - try: - repo.setparents(repo[p1].node(), repo[p2].node()) - ctx = repo[rev] - if commitmsg is None: - commitmsg = ctx.description() - keepbranch = keepbranches and repo[p1].branch() != ctx.branch() - extra = {'rebase_source': ctx.hex()} - if extrafn: - extrafn(ctx, extra) + repo.setparents(repo[p1].node(), repo[p2].node()) + ctx = repo[rev] + if commitmsg is None: + commitmsg = ctx.description() + keepbranch = keepbranches and repo[p1].branch() != ctx.branch() + extra = {'rebase_source': ctx.hex()} + if extrafn: + extrafn(ctx, extra) - backup = repo.ui.backupconfig('phases', 'new-commit') - try: - targetphase = max(ctx.phase(), phases.draft) - repo.ui.setconfig('phases', 'new-commit', targetphase, 'rebase') - if keepbranch: - repo.ui.setconfig('ui', 'allowemptycommit', True) - # Commit might fail if unresolved files exist - if date is None: - date = ctx.date() - newnode = repo.commit(text=commitmsg, user=ctx.user(), - date=date, extra=extra, editor=editor) - finally: - repo.ui.restoreconfig(backup) + targetphase = max(ctx.phase(), phases.draft) + overrides = {('phases', 'new-commit'): targetphase} + with repo.ui.configoverride(overrides, 'rebase'): + if keepbranch: + repo.ui.setconfig('ui', 'allowemptycommit', True) + # Commit might fail if unresolved files exist + if date is None: + date = ctx.date() + newnode = repo.commit(text=commitmsg, user=ctx.user(), + date=date, extra=extra, editor=editor) - repo.dirstate.setbranch(repo[newnode].branch()) - dsguard.close() - return newnode - finally: - release(dsguard) + repo.dirstate.setbranch(repo[newnode].branch()) + return newnode def rebasenode(repo, rev, p1, base, state, collapse, target): 'Rebase a single revision rev on top of p1 using base as merge ancestor' @@ -1061,9 +1130,9 @@ def clearcollapsemsg(repo): 'Remove collapse message file' - util.unlinkpath(repo.join("last-message.txt"), ignoremissing=True) + 
repo.vfs.unlinkpath("last-message.txt", ignoremissing=True) -def restorecollapsemsg(repo): +def restorecollapsemsg(repo, isabort): 'Restore previously stored collapse message' try: f = repo.vfs("last-message.txt") @@ -1072,38 +1141,17 @@ except IOError as err: if err.errno != errno.ENOENT: raise - raise error.Abort(_('no rebase in progress')) + if isabort: + # Oh well, just abort like normal + collapsemsg = '' + else: + raise error.Abort(_('missing .hg/last-message.txt for rebase')) return collapsemsg -def storestatus(repo, originalwd, target, state, collapse, keep, keepbranches, - external, activebookmark): - 'Store the current status to allow recovery' - f = repo.vfs("rebasestate", "w") - f.write(repo[originalwd].hex() + '\n') - f.write(repo[target].hex() + '\n') - f.write(repo[external].hex() + '\n') - f.write('%d\n' % int(collapse)) - f.write('%d\n' % int(keep)) - f.write('%d\n' % int(keepbranches)) - f.write('%s\n' % (activebookmark or '')) - for d, v in state.iteritems(): - oldrev = repo[d].hex() - if v >= 0: - newrev = repo[v].hex() - elif v == revtodo: - # To maintain format compatibility, we have to use nullid. - # Please do remove this special case when upgrading the format. - newrev = hex(nullid) - else: - newrev = v - f.write("%s:%s\n" % (oldrev, newrev)) - f.close() - repo.ui.debug('rebase status stored\n') - def clearstatus(repo): 'Remove the status files' _clearrebasesetvisibiliy(repo) - util.unlinkpath(repo.join("rebasestate"), ignoremissing=True) + repo.vfs.unlinkpath("rebasestate", ignoremissing=True) def needupdate(repo, state): '''check whether we should `update --clean` away from a merge, or if @@ -1155,8 +1203,11 @@ if rebased: strippoints = [ c.node() for c in repo.set('roots(%ld)', rebased)] - shouldupdate = len([ - c.node() for c in repo.set('. 
& (%ld)', rebased)]) > 0 + + updateifonnodes = set(rebased) + updateifonnodes.add(target) + updateifonnodes.add(originalwd) + shouldupdate = repo['.'].rev() in updateifonnodes # Update away from the rebase if necessary if shouldupdate or needupdate(repo, state): @@ -1183,7 +1234,8 @@ dest: context rebaseset: set of rev ''' - _setrebasesetvisibility(repo, rebaseset) + originalwd = repo['.'].rev() + _setrebasesetvisibility(repo, set(rebaseset) | set([originalwd])) # This check isn't strictly necessary, since mq detects commits over an # applied patch. But it prevents messing up the working directory when @@ -1203,7 +1255,12 @@ if commonbase == root: raise error.Abort(_('source is ancestor of destination')) if commonbase == dest: - samebranch = root.branch() == dest.branch() + wctx = repo[None] + if dest == wctx.p1(): + # when rebasing to '.', it will use the current wd branch name + samebranch = root.branch() == wctx.branch() + else: + samebranch = root.branch() == dest.branch() if not collapse and samebranch and root in dest.children(): repo.ui.debug('source is a child of destination\n') return None @@ -1268,7 +1325,7 @@ state[r] = revpruned else: state[r] = revprecursor - return repo['.'].rev(), dest.rev(), state + return originalwd, dest.rev(), state def clearrebased(ui, repo, state, skipped, collapsedas=None): """dispose of rebased revision at the end of the rebase @@ -1367,9 +1424,8 @@ """store the currently rebased set on the repo object This is used by another function to prevent rebased revision to because - hidden (see issue4505)""" + hidden (see issue4504)""" repo = repo.unfiltered() - revs = set(revs) repo._rebaseset = revs # invalidate cache if visibility changes hiddens = repo.filteredrevcache.get('visible', set()) @@ -1383,7 +1439,7 @@ del repo._rebaseset def _rebasedvisible(orig, repo): - """ensure rebased revs stay visible (see issue4505)""" + """ensure rebased revs stay visible (see issue4504)""" blockers = orig(repo) blockers.update(getattr(repo, 
'_rebaseset', ())) return blockers diff -r ed5b25874d99 -r 4baf79a77afa hgext/record.py --- a/hgext/record.py Thu Mar 23 19:54:59 2017 -0700 +++ b/hgext/record.py Fri Mar 24 08:37:26 2017 -0700 @@ -68,12 +68,9 @@ 'commit') opts["interactive"] = True - backup = ui.backupconfig('experimental', 'crecord') - try: - ui.setconfig('experimental', 'crecord', False, 'record') + overrides = {('experimental', 'crecord'): False} + with ui.configoverride(overrides, 'record'): return commands.commit(ui, repo, *pats, **opts) - finally: - ui.restoreconfig(backup) def qrefresh(origfn, ui, repo, *pats, **opts): if not opts['interactive']: @@ -117,13 +114,10 @@ opts['checkname'] = False mq.new(ui, repo, patch, *pats, **opts) - backup = ui.backupconfig('experimental', 'crecord') - try: - ui.setconfig('experimental', 'crecord', False, 'record') + overrides = {('experimental', 'crecord'): False} + with ui.configoverride(overrides, 'record'): cmdutil.dorecord(ui, repo, committomq, cmdsuggest, False, cmdutil.recordfilter, *pats, **opts) - finally: - ui.restoreconfig(backup) def qnew(origfn, ui, repo, patch, *args, **opts): if opts['interactive']: diff -r ed5b25874d99 -r 4baf79a77afa hgext/schemes.py --- a/hgext/schemes.py Thu Mar 23 19:54:59 2017 -0700 +++ b/hgext/schemes.py Fri Mar 24 08:37:26 2017 -0700 @@ -63,6 +63,7 @@ # leave the attribute unspecified. 
testedwith = 'ships-with-hg-core' +_partre = re.compile(br'\{(\d+)\}') class ShortRepository(object): def __init__(self, url, scheme, templater): @@ -70,7 +71,7 @@ self.templater = templater self.url = url try: - self.parts = max(map(int, re.findall(r'\{(\d+)\}', self.url))) + self.parts = max(map(int, _partre.findall(self.url))) except ValueError: self.parts = 0 diff -r ed5b25874d99 -r 4baf79a77afa hgext/share.py --- a/hgext/share.py Thu Mar 23 19:54:59 2017 -0700 +++ b/hgext/share.py Fri Mar 24 08:37:26 2017 -0700 @@ -48,6 +48,7 @@ error, extensions, hg, + txnutil, util, ) @@ -64,10 +65,14 @@ @command('share', [('U', 'noupdate', None, _('do not create a working directory')), - ('B', 'bookmarks', None, _('also share bookmarks'))], + ('B', 'bookmarks', None, _('also share bookmarks')), + ('', 'relative', None, _('point to source using a relative path ' + '(EXPERIMENTAL)')), + ], _('[-U] [-B] SOURCE [DEST]'), norepo=True) -def share(ui, source, dest=None, noupdate=False, bookmarks=False): +def share(ui, source, dest=None, noupdate=False, bookmarks=False, + relative=False): """create a new shared repository Initialize a new repository and working directory that shares its @@ -86,7 +91,7 @@ """ return hg.share(ui, source, dest=dest, update=not noupdate, - bookmarks=bookmarks) + bookmarks=bookmarks, relative=relative) @command('unshare', [], '') def unshare(ui, repo): @@ -108,10 +113,11 @@ destlock = hg.copystore(ui, repo, repo.path) - sharefile = repo.join('sharedpath') + sharefile = repo.vfs.join('sharedpath') util.rename(sharefile, sharefile + '.old') - repo.requirements.discard('sharedpath') + repo.requirements.discard('shared') + repo.requirements.discard('relshared') repo._writerequirements() finally: destlock and destlock.release() @@ -171,7 +177,28 @@ if _hassharedbookmarks(repo): srcrepo = _getsrcrepo(repo) if srcrepo is not None: + # just orig(srcrepo) doesn't work as expected, because + # HG_PENDING refers repo.root. 
+ try: + fp, pending = txnutil.trypending(repo.root, repo.vfs, + 'bookmarks') + if pending: + # only in this case, bookmark information in repo + # is up-to-date. + return fp + fp.close() + except IOError as inst: + if inst.errno != errno.ENOENT: + raise + + # otherwise, we should read bookmarks from srcrepo, + # because .hg/bookmarks in srcrepo might be already + # changed via another sharing repo repo = srcrepo + + # TODO: Pending changes in repo are still invisible in + # srcrepo, because bookmarks.pending is written only into repo. + # See also https://www.mercurial-scm.org/wiki/SharedRepository return orig(repo) def recordchange(orig, self, tr): diff -r ed5b25874d99 -r 4baf79a77afa hgext/shelve.py --- a/hgext/shelve.py Thu Mar 23 19:54:59 2017 -0700 +++ b/hgext/shelve.py Fri Mar 24 08:37:26 2017 -0700 @@ -46,6 +46,7 @@ scmutil, templatefilters, util, + vfs as vfsmod, ) from . import ( @@ -62,7 +63,7 @@ backupdir = 'shelve-backup' shelvedir = 'shelved' -shelvefileextensions = ['hg', 'patch'] +shelvefileextensions = ['hg', 'patch', 'oshelve'] # universal extension is present in all types of shelves patchextension = 'patch' @@ -78,8 +79,8 @@ def __init__(self, repo, name, filetype=None): self.repo = repo self.name = name - self.vfs = scmutil.vfs(repo.join(shelvedir)) - self.backupvfs = scmutil.vfs(repo.join(backupdir)) + self.vfs = vfsmod.vfs(repo.vfs.join(shelvedir)) + self.backupvfs = vfsmod.vfs(repo.vfs.join(backupdir)) self.ui = self.repo.ui if filetype: self.fname = name + '.' + filetype @@ -153,6 +154,12 @@ bundle2.writebundle(self.ui, cg, self.fname, btype, self.vfs, compression=compression) + def writeobsshelveinfo(self, info): + scmutil.simplekeyvaluefile(self.vfs, self.fname).write(info) + + def readobsshelveinfo(self): + return scmutil.simplekeyvaluefile(self.vfs, self.fname).read() + class shelvedstate(object): """Handle persistence during unshelving operations. 
@@ -177,7 +184,7 @@ wctx = nodemod.bin(fp.readline().strip()) pendingctx = nodemod.bin(fp.readline().strip()) parents = [nodemod.bin(h) for h in fp.readline().split()] - stripnodes = [nodemod.bin(h) for h in fp.readline().split()] + nodestoprune = [nodemod.bin(h) for h in fp.readline().split()] branchtorestore = fp.readline().strip() keep = fp.readline().strip() == cls._keep except (ValueError, TypeError) as err: @@ -191,7 +198,7 @@ obj.wctx = repo[wctx] obj.pendingctx = repo[pendingctx] obj.parents = parents - obj.stripnodes = stripnodes + obj.nodestoprune = nodestoprune obj.branchtorestore = branchtorestore obj.keep = keep except error.RepoLookupError as err: @@ -200,7 +207,7 @@ return obj @classmethod - def save(cls, repo, name, originalwctx, pendingctx, stripnodes, + def save(cls, repo, name, originalwctx, pendingctx, nodestoprune, branchtorestore, keep=False): fp = repo.vfs(cls._filename, 'wb') fp.write('%i\n' % cls._version) @@ -210,17 +217,17 @@ fp.write('%s\n' % ' '.join([nodemod.hex(p) for p in repo.dirstate.parents()])) fp.write('%s\n' % - ' '.join([nodemod.hex(n) for n in stripnodes])) + ' '.join([nodemod.hex(n) for n in nodestoprune])) fp.write('%s\n' % branchtorestore) fp.write('%s\n' % (cls._keep if keep else cls._nokeep)) fp.close() @classmethod def clear(cls, repo): - util.unlinkpath(repo.join(cls._filename), ignoremissing=True) + repo.vfs.unlinkpath(cls._filename, ignoremissing=True) def cleanupoldbackups(repo): - vfs = scmutil.vfs(repo.join(backupdir)) + vfs = vfsmod.vfs(repo.vfs.join(backupdir)) maxbackups = repo.ui.configint('shelve', 'maxbackups', 10) hgfiles = [f for f in vfs.listdir() if f.endswith('.' + patchextension)] @@ -235,11 +242,7 @@ continue base = f[:-(1 + len(patchextension))] for ext in shelvefileextensions: - try: - vfs.unlink(base + '.' + ext) - except OSError as err: - if err.errno != errno.ENOENT: - raise + vfs.tryunlink(base + '.' 
+ ext) def _aborttransaction(repo): '''Abort current transaction for shelve/unshelve, but keep dirstate @@ -313,17 +316,16 @@ hasmq = util.safehasattr(repo, 'mq') if hasmq: saved, repo.mq.checkapplied = repo.mq.checkapplied, False - backup = repo.ui.backupconfig('phases', 'new-commit') + overrides = {('phases', 'new-commit'): phases.secret} try: - repo.ui.setconfig('phases', 'new-commit', phases.secret) editor_ = False if editor: editor_ = cmdutil.getcommiteditor(editform='shelve.shelve', **opts) - return repo.commit(message, shelveuser, opts.get('date'), match, - editor=editor_, extra=extra) + with repo.ui.configoverride(overrides): + return repo.commit(message, shelveuser, opts.get('date'), + match, editor=editor_, extra=extra) finally: - repo.ui.restoreconfig(backup) if hasmq: repo.mq.checkapplied = saved @@ -485,6 +487,7 @@ if not ui.plain(): width = ui.termwidth() namelabel = 'shelve.newest' + ui.pager('shelve') for mtime, name in listshelves(repo): sname = util.split(name)[1] if pats and sname not in pats: @@ -549,19 +552,17 @@ try: checkparents(repo, state) - util.rename(repo.join('unshelverebasestate'), - repo.join('rebasestate')) + repo.vfs.rename('unshelverebasestate', 'rebasestate') try: rebase.rebase(ui, repo, **{ 'abort' : True }) except Exception: - util.rename(repo.join('rebasestate'), - repo.join('unshelverebasestate')) + repo.vfs.rename('rebasestate', 'unshelverebasestate') raise mergefiles(ui, repo, state.wctx, state.pendingctx) - repair.strip(ui, repo, state.stripnodes, backup=False, + repair.strip(ui, repo, state.nodestoprune, backup=False, topic='shelve') finally: shelvedstate.clear(repo) @@ -617,15 +618,13 @@ _("unresolved conflicts, can't continue"), hint=_("see 'hg resolve', then 'hg unshelve --continue'")) - util.rename(repo.join('unshelverebasestate'), - repo.join('rebasestate')) + repo.vfs.rename('unshelverebasestate', 'rebasestate') try: rebase.rebase(ui, repo, **{ 'continue' : True }) except Exception: - 
util.rename(repo.join('rebasestate'), - repo.join('unshelverebasestate')) + repo.vfs.rename('rebasestate', 'unshelverebasestate') raise shelvectx = repo['tip'] @@ -634,12 +633,12 @@ shelvectx = state.pendingctx else: # only strip the shelvectx if the rebase produced it - state.stripnodes.append(shelvectx.node()) + state.nodestoprune.append(shelvectx.node()) mergefiles(ui, repo, state.wctx, shelvectx) restorebranch(ui, repo, state.branchtorestore) - repair.strip(ui, repo, state.stripnodes, backup=False, topic='shelve') + repair.strip(ui, repo, state.nodestoprune, backup=False, topic='shelve') shelvedstate.clear(repo) unshelvecleanup(ui, repo, state.name, opts) ui.status(_("unshelve of '%s' complete\n") % state.name) @@ -691,13 +690,12 @@ except error.InterventionRequired: tr.close() - stripnodes = [repo.changelog.node(rev) - for rev in xrange(oldtiprev, len(repo))] - shelvedstate.save(repo, basename, pctx, tmpwctx, stripnodes, + nodestoprune = [repo.changelog.node(rev) + for rev in xrange(oldtiprev, len(repo))] + shelvedstate.save(repo, basename, pctx, tmpwctx, nodestoprune, branchtorestore, opts.get('keep')) - util.rename(repo.join('rebasestate'), - repo.join('unshelverebasestate')) + repo.vfs.rename('rebasestate', 'unshelverebasestate') raise error.InterventionRequired( _("unresolved conflicts (see 'hg resolve', then " "'hg unshelve --continue')")) @@ -747,10 +745,12 @@ _('continue an incomplete unshelve operation')), ('k', 'keep', None, _('keep shelve after unshelving')), + ('n', 'name', '', + _('restore shelved change with given name'), _('NAME')), ('t', 'tool', '', _('specify merge tool')), ('', 'date', '', _('set date for temporary commits (DEPRECATED)'), _('DATE'))], - _('hg unshelve [SHELVED]')) + _('hg unshelve [[-n] SHELVED]')) def unshelve(ui, repo, *shelved, **opts): """restore a shelved change to the working directory @@ -795,6 +795,9 @@ continuef = opts.get('continue') if not abortf and not continuef: cmdutil.checkunfinished(repo) + shelved = 
list(shelved) + if opts.get("name"): + shelved.append(opts["name"]) if abortf or continuef: if abortf and continuef: @@ -848,9 +851,7 @@ oldquiet = ui.quiet lock = tr = None - forcemerge = ui.backupconfig('ui', 'forcemerge') try: - ui.setconfig('ui', 'forcemerge', opts.get('tool', ''), 'unshelve') lock = repo.lock() tr = repo.transaction('unshelve', report=lambda x: None) @@ -864,31 +865,33 @@ # and shelvectx is the unshelved changes. Then we merge it all down # to the original pctx. - tmpwctx, addedbefore = _commitworkingcopychanges(ui, repo, opts, - tmpwctx) - - repo, shelvectx = _unshelverestorecommit(ui, repo, basename, oldquiet) - _checkunshelveuntrackedproblems(ui, repo, shelvectx) - branchtorestore = '' - if shelvectx.branch() != shelvectx.p1().branch(): - branchtorestore = shelvectx.branch() + overrides = {('ui', 'forcemerge'): opts.get('tool', '')} + with ui.configoverride(overrides, 'unshelve'): + tmpwctx, addedbefore = _commitworkingcopychanges(ui, repo, opts, + tmpwctx) - shelvectx = _rebaserestoredcommit(ui, repo, opts, tr, oldtiprev, - basename, pctx, tmpwctx, shelvectx, - branchtorestore) - mergefiles(ui, repo, pctx, shelvectx) - restorebranch(ui, repo, branchtorestore) - _forgetunknownfiles(repo, shelvectx, addedbefore) + repo, shelvectx = _unshelverestorecommit(ui, repo, basename, + oldquiet) + _checkunshelveuntrackedproblems(ui, repo, shelvectx) + branchtorestore = '' + if shelvectx.branch() != shelvectx.p1().branch(): + branchtorestore = shelvectx.branch() - shelvedstate.clear(repo) - _finishunshelve(repo, oldtiprev, tr) - unshelvecleanup(ui, repo, basename, opts) + shelvectx = _rebaserestoredcommit(ui, repo, opts, tr, oldtiprev, + basename, pctx, tmpwctx, + shelvectx, branchtorestore) + mergefiles(ui, repo, pctx, shelvectx) + restorebranch(ui, repo, branchtorestore) + _forgetunknownfiles(repo, shelvectx, addedbefore) + + shelvedstate.clear(repo) + _finishunshelve(repo, oldtiprev, tr) + unshelvecleanup(ui, repo, basename, opts) finally: ui.quiet 
= oldquiet if tr: tr.release() lockmod.release(lock) - ui.restoreconfig(forcemerge) @command('shelve', [('A', 'addremove', None, diff -r ed5b25874d99 -r 4baf79a77afa hgext/transplant.py --- a/hgext/transplant.py Thu Mar 23 19:54:59 2017 -0700 +++ b/hgext/transplant.py Fri Mar 24 08:37:26 2017 -0700 @@ -28,11 +28,14 @@ merge, node as nodemod, patch, + pycompat, registrar, revlog, revset, scmutil, + smartset, util, + vfs as vfsmod, ) class TransplantError(error.Abort): @@ -58,7 +61,7 @@ self.opener = opener if not opener: - self.opener = scmutil.opener(self.path) + self.opener = vfsmod.vfs(self.path) self.transplants = {} self.dirty = False self.read() @@ -100,8 +103,8 @@ class transplanter(object): def __init__(self, ui, repo, opts): self.ui = ui - self.path = repo.join('transplant') - self.opener = scmutil.opener(self.path) + self.path = repo.vfs.join('transplant') + self.opener = vfsmod.vfs(self.path) self.transplants = transplants(self.path, 'transplants', opener=self.opener) def getcommiteditor(): @@ -197,7 +200,7 @@ patchfile = None else: fd, patchfile = tempfile.mkstemp(prefix='hg-transplant-') - fp = os.fdopen(fd, 'w') + fp = os.fdopen(fd, pycompat.sysstr('w')) gen = patch.diff(source, parent, node, opts=diffopts) for chunk in gen: fp.write(chunk) @@ -245,7 +248,7 @@ self.ui.status(_('filtering %s\n') % patchfile) user, date, msg = (changelog[1], changelog[2], changelog[4]) fd, headerfile = tempfile.mkstemp(prefix='hg-transplant-') - fp = os.fdopen(fd, 'w') + fp = os.fdopen(fd, pycompat.sysstr('w')) fp.write("# HG changeset patch\n") fp.write("# User %s\n" % user) fp.write("# Date %d %d\n" % date) @@ -258,7 +261,8 @@ environ={'HGUSER': changelog[1], 'HGREVISION': nodemod.hex(node), }, - onerr=error.Abort, errprefix=_('filter failed')) + onerr=error.Abort, errprefix=_('filter failed'), + blockedtag='transplant_filter') user, date, msg = self.parselog(file(headerfile))[1:4] finally: os.unlink(headerfile) @@ -722,7 +726,7 @@ s = revset.getset(repo, subset, x) 
else: s = subset - return revset.baseset([r for r in s if + return smartset.baseset([r for r in s if repo[r].extra().get('transplant_source')]) templatekeyword = registrar.templatekeyword() diff -r ed5b25874d99 -r 4baf79a77afa hgext/win32text.py --- a/hgext/win32text.py Thu Mar 23 19:54:59 2017 -0700 +++ b/hgext/win32text.py Fri Mar 24 08:37:26 2017 -0700 @@ -74,7 +74,7 @@ 'and does not need EOL conversion by the win32text plugin.\n' 'Before your next commit, please reconsider your ' 'encode/decode settings in \nMercurial.ini or %s.\n') % - (filename, newlinestr[newline], repo.join('hgrc'))) + (filename, newlinestr[newline], repo.vfs.join('hgrc'))) def dumbdecode(s, cmd, **kwargs): checknewline(s, '\r\n', **kwargs) diff -r ed5b25874d99 -r 4baf79a77afa hgext/zeroconf/__init__.py --- a/hgext/zeroconf/__init__.py Thu Mar 23 19:54:59 2017 -0700 +++ b/hgext/zeroconf/__init__.py Fri Mar 24 08:37:26 2017 -0700 @@ -64,7 +64,9 @@ # Generic method, sometimes gives useless results try: dumbip = socket.gethostbyaddr(socket.gethostname())[2][0] - if not dumbip.startswith('127.') and ':' not in dumbip: + if ':' in dumbip: + dumbip = '127.0.0.1' + if not dumbip.startswith('127.'): return dumbip except (socket.gaierror, socket.herror): dumbip = '127.0.0.1' diff -r ed5b25874d99 -r 4baf79a77afa mercurial/__init__.py --- a/mercurial/__init__.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/__init__.py Fri Mar 24 08:37:26 2017 -0700 @@ -68,7 +68,7 @@ # indicates the type of module. So just assume what we found # is OK (even though it could be a pure Python module). except ImportError: - if modulepolicy == 'c': + if modulepolicy == b'c': raise zl = ziploader('mercurial', 'pure') mod = zl.load_module(name) @@ -106,7 +106,7 @@ 'version should exist' % name) except ImportError: - if modulepolicy == 'c': + if modulepolicy == b'c': raise # Could not load the C extension and pure Python is allowed. So @@ -137,6 +137,9 @@ # Only handle Mercurial-related modules. 
if not fullname.startswith(('mercurial.', 'hgext.', 'hgext3rd.')): return None + # zstd is already dual-version clean, don't try and mangle it + if fullname.startswith('mercurial.zstd'): + return None # This assumes Python 3 doesn't support loading C modules. if fullname in _dualmodules: @@ -280,7 +283,7 @@ continue r, c = t.start l = (b'; from mercurial.pycompat import ' - b'delattr, getattr, hasattr, setattr, xrange\n') + b'delattr, getattr, hasattr, setattr, xrange, open\n') for u in tokenize.tokenize(io.BytesIO(l).readline): if u.type in (tokenize.ENCODING, token.ENDMARKER): continue @@ -307,17 +310,10 @@ if argidx is not None: _ensureunicode(argidx) - # Bare open call (not an attribute on something else), the - # second argument (mode) must be a string, not bytes - elif fn == 'open' and not _isop(i - 1, '.'): - arg1idx = _findargnofcall(1) - if arg1idx is not None: - _ensureunicode(arg1idx) - - # It changes iteritems to items as iteritems is not + # It changes iteritems/values to items/values as they are not # present in Python 3 world. - elif fn == 'iteritems': - yield t._replace(string='items') + elif fn in ('iteritems', 'itervalues'): + yield t._replace(string=fn[4:]) continue # Emit unmodified token. @@ -327,7 +323,7 @@ # ``replacetoken`` or any mechanism that changes semantics of module # loading is changed. Otherwise cached bytecode may get loaded without # the new transformation mechanisms applied. - BYTECODEHEADER = b'HG\x00\x06' + BYTECODEHEADER = b'HG\x00\x09' class hgloader(importlib.machinery.SourceFileLoader): """Custom module loader that transforms source code. diff -r ed5b25874d99 -r 4baf79a77afa mercurial/ancestor.py --- a/mercurial/ancestor.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/ancestor.py Fri Mar 24 08:37:26 2017 -0700 @@ -296,6 +296,8 @@ except StopIteration: return False + __bool__ = __nonzero__ + def __iter__(self): """Generate the ancestors of _initrevs in reverse topological order. 
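Note on the mercurial/ancestor.py hunk above: the added `__bool__ = __nonzero__` line is there because Python 2 consults `__nonzero__` for truth testing while Python 3 consults `__bool__`. Without the alias, a Python 3 object that defines only `__nonzero__` (and no `__len__`) is always truthy. The sketch below illustrates the pattern in isolation; `LazySequence` is a hypothetical stand-in, not a class from Mercurial, and it does not reproduce the real `lazyancestors` logic.

```python
class LazySequence(object):
    """Hypothetical container that answers truth tests lazily."""

    def __init__(self, iterable):
        self._iterable = iterable

    def __nonzero__(self):
        # Python 2 truthiness hook: true if at least one item exists.
        try:
            next(iter(self._iterable))
            return True
        except StopIteration:
            return False

    # Python 3 looks up __bool__ instead, so alias it to keep both
    # interpreter versions in agreement.
    __bool__ = __nonzero__


empty = LazySequence([])
nonempty = LazySequence([1, 2, 3])
```

With the alias in place, `bool(empty)` and `bool(nonempty)` give the same answers on both interpreter lines. The bundle2.py hunk later in this patch applies the same alias.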
diff -r ed5b25874d99 -r 4baf79a77afa mercurial/archival.py --- a/mercurial/archival.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/archival.py Fri Mar 24 08:37:26 2017 -0700 @@ -22,8 +22,8 @@ encoding, error, match as matchmod, - scmutil, util, + vfs as vfsmod, ) stringio = util.stringio @@ -249,7 +249,7 @@ def __init__(self, name, mtime): self.basedir = name - self.opener = scmutil.opener(self.basedir) + self.opener = vfsmod.vfs(self.basedir) def addfile(self, name, mode, islink, data): if islink: @@ -331,7 +331,7 @@ for subpath in sorted(ctx.substate): sub = ctx.workingsub(subpath) submatch = matchmod.subdirmatcher(subpath, matchfn) - total += sub.archive(archiver, prefix, submatch) + total += sub.archive(archiver, prefix, submatch, decode) if total == 0: raise error.Abort(_('no files match the archive pattern')) diff -r ed5b25874d99 -r 4baf79a77afa mercurial/bdiff_module.c --- a/mercurial/bdiff_module.c Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/bdiff_module.c Fri Mar 24 08:37:26 2017 -0700 @@ -158,7 +158,7 @@ r = PyBytes_AsString(s); rlen = PyBytes_Size(s); - w = (char *)malloc(rlen ? rlen : 1); + w = (char *)PyMem_Malloc(rlen ? rlen : 1); if (!w) goto nomem; @@ -178,7 +178,7 @@ result = PyBytes_FromStringAndSize(w, wlen); nomem: - free(w); + PyMem_Free(w); return result ? result : PyErr_NoMemory(); } diff -r ed5b25874d99 -r 4baf79a77afa mercurial/bookmarks.py --- a/mercurial/bookmarks.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/bookmarks.py Fri Mar 24 08:37:26 2017 -0700 @@ -19,6 +19,7 @@ error, lock as lockmod, obsolete, + txnutil, util, ) @@ -29,17 +30,8 @@ bookmarks or the committed ones. Other extensions (like share) may need to tweak this behavior further. 
""" - bkfile = None - if 'HG_PENDING' in encoding.environ: - try: - bkfile = repo.vfs('bookmarks.pending') - except IOError as inst: - if inst.errno != errno.ENOENT: - raise - if bkfile is None: - bkfile = repo.vfs('bookmarks') - return bkfile - + fp, pending = txnutil.trypending(repo.root, repo.vfs, 'bookmarks') + return fp class bmstore(dict): """Storage for bookmarks. @@ -139,11 +131,7 @@ finally: f.close() else: - try: - self._repo.vfs.unlink('bookmarks.current') - except OSError as inst: - if inst.errno != errno.ENOENT: - raise + self._repo.vfs.tryunlink('bookmarks.current') self._aclean = True def _write(self, fp): diff -r ed5b25874d99 -r 4baf79a77afa mercurial/branchmap.py --- a/mercurial/branchmap.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/branchmap.py Fri Mar 24 08:37:26 2017 -0700 @@ -7,9 +7,7 @@ from __future__ import absolute_import -import array import struct -import time from .node import ( bin, @@ -21,12 +19,12 @@ encoding, error, scmutil, + util, ) -array = array.array calcsize = struct.calcsize -pack = struct.pack -unpack = struct.unpack +pack_into = struct.pack_into +unpack_from = struct.unpack_from def _filename(repo): """name of a branchcache file for a given repo or repoview""" @@ -233,7 +231,7 @@ def write(self, repo): try: f = repo.vfs(_filename(repo), "w", atomictemp=True) - cachekey = [hex(self.tipnode), str(self.tiprev)] + cachekey = [hex(self.tipnode), '%d' % self.tiprev] if self.filteredhash is not None: cachekey.append(hex(self.filteredhash)) f.write(" ".join(cachekey) + '\n') @@ -261,7 +259,7 @@ missing heads, and a generator of nodes that are strictly a superset of heads missing, this function updates self to be correct. 
""" - starttime = time.time() + starttime = util.timer() cl = repo.changelog # collect new branch entries newbranches = {} @@ -314,7 +312,7 @@ self.tiprev = tiprev self.filteredhash = scmutil.filteredhash(repo, self.tiprev) - duration = time.time() - starttime + duration = util.timer() - starttime repo.ui.log('branchcache', 'updated %s branch cache in %.4f seconds\n', repo.filtername, duration) @@ -357,12 +355,14 @@ assert repo.filtername is None self._repo = repo self._names = [] # branch names in local encoding with static index - self._rbcrevs = array('c') # structs of type _rbcrecfmt + self._rbcrevs = bytearray() self._rbcsnameslen = 0 # length of names read at _rbcsnameslen try: bndata = repo.vfs.read(_rbcnames) self._rbcsnameslen = len(bndata) # for verification before writing - self._names = [encoding.tolocal(bn) for bn in bndata.split('\0')] + if bndata: + self._names = [encoding.tolocal(bn) + for bn in bndata.split('\0')] except (IOError, OSError): if readonly: # don't try to use cache - fall back to the slow path @@ -371,7 +371,7 @@ if self._names: try: data = repo.vfs.read(_rbcrevs) - self._rbcrevs.fromstring(data) + self._rbcrevs[:] = data except (IOError, OSError) as inst: repo.ui.debug("couldn't read revision branch cache: %s\n" % inst) @@ -390,8 +390,7 @@ self._rbcnamescount = 0 self._namesreverse.clear() self._rbcrevslen = len(self._repo.changelog) - self._rbcrevs = array('c') - self._rbcrevs.fromstring('\0' * (self._rbcrevslen * _rbcrecsize)) + self._rbcrevs = bytearray(self._rbcrevslen * _rbcrecsize) def branchinfo(self, rev): """Return branch name and close flag for rev, using and updating @@ -409,8 +408,8 @@ # fast path: extract data from cache, use it if node is matching reponode = changelog.node(rev)[:_rbcnodelen] - cachenode, branchidx = unpack( - _rbcrecfmt, buffer(self._rbcrevs, rbcrevidx, _rbcrecsize)) + cachenode, branchidx = unpack_from( + _rbcrecfmt, util.buffer(self._rbcrevs), rbcrevidx) close = bool(branchidx & _rbccloseflag) if 
close: branchidx &= _rbcbranchidxmask @@ -427,7 +426,7 @@ else: # rev/node map has changed, invalidate the cache from here up self._repo.ui.debug("history modification detected - truncating " - "revision branch cache to revision %s\n" % rev) + "revision branch cache to revision %d\n" % rev) truncate = rbcrevidx + _rbcrecsize del self._rbcrevs[truncate:] self._rbcrevslen = min(self._rbcrevslen, truncate) @@ -453,14 +452,14 @@ def _setcachedata(self, rev, node, branchidx): """Writes the node's branch data to the in-memory cache data.""" + if rev == nullrev: + return rbcrevidx = rev * _rbcrecsize - rec = array('c') - rec.fromstring(pack(_rbcrecfmt, node, branchidx)) if len(self._rbcrevs) < rbcrevidx + _rbcrecsize: self._rbcrevs.extend('\0' * (len(self._repo.changelog) * _rbcrecsize - len(self._rbcrevs))) - self._rbcrevs[rbcrevidx:rbcrevidx + _rbcrecsize] = rec + pack_into(_rbcrecfmt, self._rbcrevs, rbcrevidx, node, branchidx) self._rbcrevslen = min(self._rbcrevslen, rev) tr = self._repo.currenttransaction() @@ -504,7 +503,7 @@ len(self._rbcrevs) // _rbcrecsize) f = repo.vfs.open(_rbcrevs, 'ab') if f.tell() != start: - repo.ui.debug("truncating %s to %s\n" % (_rbcrevs, start)) + repo.ui.debug("truncating %s to %d\n" % (_rbcrevs, start)) f.seek(start) if f.tell() != start: start = 0 diff -r ed5b25874d99 -r 4baf79a77afa mercurial/bundle2.py --- a/mercurial/bundle2.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/bundle2.py Fri Mar 24 08:37:26 2017 -0700 @@ -271,6 +271,8 @@ def __nonzero__(self): return bool(self._sequences) + __bool__ = __nonzero__ + class bundleoperation(object): """an object that represents a single bundling process @@ -320,9 +322,6 @@ It iterates over each part then searches for and uses the proper handling code to process the part. Parts are processed in order. - This is very early version of this function that will be strongly reworked - before final usage. - Unknown Mandatory part will abort the process. 
It is temporarily possible to provide a prebuilt bundleoperation to the @@ -865,6 +864,11 @@ self._generated = None self.mandatory = mandatory + def __repr__(self): + cls = "%s.%s" % (self.__class__.__module__, self.__class__.__name__) + return ('<%s object at %x; id: %s; type: %s; mandatory: %s>' + % (cls, id(self), self.id, self.type, self.mandatory)) + def copy(self): """return a copy of the part diff -r ed5b25874d99 -r 4baf79a77afa mercurial/bundlerepo.py --- a/mercurial/bundlerepo.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/bundlerepo.py Fri Mar 24 08:37:26 2017 -0700 @@ -37,8 +37,8 @@ phases, pycompat, revlog, - scmutil, util, + vfs as vfsmod, ) class bundlerevlog(revlog.revlog): @@ -50,7 +50,7 @@ # # To differentiate a rev in the bundle from a rev in the revlog, we # check revision against repotiprev. - opener = scmutil.readonlyvfs(opener) + opener = vfsmod.readonlyvfs(opener) revlog.revlog.__init__(self, opener, indexfile) self.bundle = bundle n = len(self) @@ -209,7 +209,7 @@ node = self.node(node) if node in self.fulltextcache: - result = self.fulltextcache[node].tostring() + result = '%s' % self.fulltextcache[node] else: result = manifest.manifestrevlog.revision(self, nodeorrev) return result @@ -239,7 +239,7 @@ def __init__(self, *args, **kwargs): super(bundlephasecache, self).__init__(*args, **kwargs) if util.safehasattr(self, 'opener'): - self.opener = scmutil.readonlyvfs(self.opener) + self.opener = vfsmod.readonlyvfs(self.opener) def write(self): raise NotImplementedError @@ -272,7 +272,7 @@ suffix=".hg10un") self.tempfile = temp - with os.fdopen(fdtemp, 'wb') as fptemp: + with os.fdopen(fdtemp, pycompat.sysstr('wb')) as fptemp: fptemp.write(header) while True: chunk = read(2**18) diff -r ed5b25874d99 -r 4baf79a77afa mercurial/changegroup.py --- a/mercurial/changegroup.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/changegroup.py Fri Mar 24 08:37:26 2017 -0700 @@ -26,6 +26,7 @@ error, mdiff, phases, + pycompat, util, ) @@ -98,7 +99,7 @@ fh 
= open(filename, "wb", 131072) else: fd, filename = tempfile.mkstemp(prefix="hg-bundle-", suffix=".hg") - fh = os.fdopen(fd, "wb") + fh = os.fdopen(fd, pycompat.sysstr("wb")) cleanup = filename for c in chunks: fh.write(c) diff -r ed5b25874d99 -r 4baf79a77afa mercurial/changelog.py --- a/mercurial/changelog.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/changelog.py Fri Mar 24 08:37:26 2017 -0700 @@ -32,7 +32,7 @@ >>> s 'ab\\ncd\\\\\\\\n\\x00ab\\rcd\\\\\\n' >>> res = _string_escape(s) - >>> s == res.decode('string_escape') + >>> s == util.unescapestr(res) True """ # subset of the string_escape codec @@ -57,7 +57,7 @@ l = l.replace('\\\\', '\\\\\n') l = l.replace('\\0', '\0') l = l.replace('\n', '') - k, v = l.decode('string_escape').split(':', 1) + k, v = util.unescapestr(l).split(':', 1) extra[k] = v return extra diff -r ed5b25874d99 -r 4baf79a77afa mercurial/chgserver.py --- a/mercurial/chgserver.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/chgserver.py Fri Mar 24 08:37:26 2017 -0700 @@ -31,13 +31,15 @@ :: [chgserver] - idletimeout = 3600 # seconds, after which an idle server will exit - skiphash = False # whether to skip config or env change checks + # how long (in seconds) should an idle chg server exit + idletimeout = 3600 + + # whether to skip config or env change checks + skiphash = False """ from __future__ import absolute_import -import errno import hashlib import inspect import os @@ -176,26 +178,17 @@ else: self._csystem = csystem - def system(self, cmd, environ=None, cwd=None, onerr=None, - errprefix=None): + def _runsystem(self, cmd, environ, cwd, out): # fallback to the original system method if the output needs to be # captured (to self._buffers), or the output stream is not stdout # (e.g. stderr, cStringIO), because the chg client is not aware of # these situations and will behave differently (write to stdout). 
- if (any(s[1] for s in self._bufferstates) + if (out is not self.fout or not util.safehasattr(self.fout, 'fileno') or self.fout.fileno() != util.stdout.fileno()): - return super(chgui, self).system(cmd, environ, cwd, onerr, - errprefix) + return util.system(cmd, environ=environ, cwd=cwd, out=out) self.flush() - rc = self._csystem(cmd, util.shellenviron(environ), cwd) - if rc and onerr: - errmsg = '%s %s' % (os.path.basename(cmd.split(None, 1)[0]), - util.explainexit(rc)[0]) - if errprefix: - errmsg = '%s: %s' % (errprefix, errmsg) - raise onerr(errmsg) - return rc + return self._csystem(cmd, util.shellenviron(environ), cwd) def _runpager(self, cmd): self._csystem(cmd, util.shellenviron(), type='pager', @@ -287,9 +280,9 @@ _iochannels = [ # server.ch, ui.fp, mode - ('cin', 'fin', 'rb'), - ('cout', 'fout', 'wb'), - ('cerr', 'ferr', 'wb'), + ('cin', 'fin', pycompat.sysstr('rb')), + ('cout', 'fout', pycompat.sysstr('wb')), + ('cerr', 'ferr', pycompat.sysstr('wb')), ] class chgcmdserver(commandserver.server): @@ -549,11 +542,7 @@ # remove another server's socket file. but that's okay # since that server will detect and exit automatically and # the client will start a new server on demand. 
- try: - os.unlink(self._realaddress) - except OSError as exc: - if exc.errno != errno.ENOENT: - raise + util.tryunlink(self._realaddress) def printbanner(self, address): # no "listening at" message should be printed to simulate hg behavior diff -r ed5b25874d99 -r 4baf79a77afa mercurial/cmdutil.py --- a/mercurial/cmdutil.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/cmdutil.py Fri Mar 24 08:37:26 2017 -0700 @@ -26,14 +26,12 @@ changelog, copies, crecord as crecordmod, - dirstateguard as dirstateguardmod, encoding, error, formatter, graphmod, lock as lockmod, match as matchmod, - mergeutil, obsolete, patch, pathutil, @@ -43,9 +41,11 @@ revlog, revset, scmutil, + smartset, templatekw, templater, util, + vfs as vfsmod, ) stringio = util.stringio @@ -201,7 +201,7 @@ newlyaddedandmodifiedfiles] backups = {} if tobackup: - backupdir = repo.join('record-backups') + backupdir = repo.vfs.join('record-backups') try: os.mkdir(backupdir) except OSError as err: @@ -584,7 +584,7 @@ raise error.CommandError(cmd, _('invalid arguments')) if not os.path.isfile(file_): raise error.Abort(_("revlog '%s' not found") % file_) - r = revlog.revlog(scmutil.opener(pycompat.getcwd(), audit=False), + r = revlog.revlog(vfsmod.vfs(pycompat.getcwd(), audit=False), file_[:-2] + ".i") return r @@ -728,7 +728,7 @@ dryrun=dryrun, cwd=cwd) if rename and not dryrun: if not after and srcexists and not samefile: - util.unlinkpath(repo.wjoin(abssrc)) + repo.wvfs.unlinkpath(abssrc) wctx.forget([abssrc]) # pat: ossep @@ -971,20 +971,18 @@ editor = None else: editor = getcommiteditor(editform=editform, **opts) - allowemptyback = repo.ui.backupconfig('ui', 'allowemptycommit') extra = {} for idfunc in extrapreimport: extrapreimportmap[idfunc](repo, extractdata, extra, opts) - try: - if partial: - repo.ui.setconfig('ui', 'allowemptycommit', True) + overrides = {} + if partial: + overrides[('ui', 'allowemptycommit')] = True + with repo.ui.configoverride(overrides, 'import'): n = repo.commit(message, user, 
date, match=m, editor=editor, extra=extra) for idfunc in extrapostimport: extrapostimportmap[idfunc](repo[n]) - finally: - repo.ui.restoreconfig(allowemptyback) else: if opts.get('exact') or importbranch: branch = branch or 'default' @@ -1300,7 +1298,7 @@ for key, value in sorted(extra.items()): # i18n: column positioning for "hg log" self.ui.write(_("extra: %s=%s\n") - % (key, value.encode('string_escape')), + % (key, util.escapestr(value)), label='ui.debug log.extra') description = ctx.description().strip() @@ -1443,24 +1441,13 @@ def __init__(self, ui, repo, matchfn, diffopts, tmpl, mapfile, buffered): changeset_printer.__init__(self, ui, repo, matchfn, diffopts, buffered) - formatnode = ui.debugflag and (lambda x: x) or (lambda x: x[:12]) - filters = {'formatnode': formatnode} - defaulttempl = { - 'parent': '{rev}:{node|formatnode} ', - 'manifest': '{rev}:{node|formatnode}', - 'file_copy': '{name} ({source})', - 'envvar': '{key}={value}', - 'extra': '{key}={value|stringescape}' - } - # filecopy is preserved for compatibility reasons - defaulttempl['filecopy'] = defaulttempl['file_copy'] assert not (tmpl and mapfile) + defaulttempl = templatekw.defaulttempl if mapfile: - self.t = templater.templater.frommapfile(mapfile, filters=filters, + self.t = templater.templater.frommapfile(mapfile, cache=defaulttempl) else: self.t = formatter.maketemplater(ui, 'changeset', tmpl, - filters=filters, cache=defaulttempl) self.cache = {} @@ -2092,11 +2079,11 @@ if opts.get('rev'): revs = scmutil.revrange(repo, opts['rev']) elif follow and repo.dirstate.p1() == nullid: - revs = revset.baseset() + revs = smartset.baseset() elif follow: revs = repo.revs('reverse(:.)') else: - revs = revset.spanset(repo) + revs = smartset.spanset(repo) revs.reverse() return revs @@ -2111,7 +2098,7 @@ limit = loglimit(opts) revs = _logrevs(repo, opts) if not revs: - return revset.baseset(), None, None + return smartset.baseset(), None, None expr, filematcher = _makelogrevset(repo, pats, opts, revs) 
if opts.get('rev'): # User-specified revs might be unsorted, but don't sort before @@ -2127,7 +2114,7 @@ if idx >= limit: break limitedrevs.append(rev) - revs = revset.baseset(limitedrevs) + revs = smartset.baseset(limitedrevs) return revs, expr, filematcher @@ -2142,7 +2129,7 @@ limit = loglimit(opts) revs = _logrevs(repo, opts) if not revs: - return revset.baseset([]), None, None + return smartset.baseset([]), None, None expr, filematcher = _makelogrevset(repo, pats, opts, revs) if expr: matcher = revset.match(repo.ui, expr, order=revset.followorder) @@ -2153,7 +2140,7 @@ if limit <= idx: break limitedrevs.append(r) - revs = revset.baseset(limitedrevs) + revs = smartset.baseset(limitedrevs) return revs, expr, filematcher @@ -2225,7 +2212,7 @@ graphmod.ascii(ui, state, type, char, lines, coldata) displayer.close() -def graphlog(ui, repo, *pats, **opts): +def graphlog(ui, repo, pats, opts): # Parameters are identical to log command ones revs, expr, filematcher = getgraphlogrevs(repo, pats, opts) revdag = graphmod.dagwalker(repo, revs) @@ -2236,6 +2223,8 @@ if opts.get('rev'): endrev = scmutil.revrange(repo, opts.get('rev')).max() + 1 getrenamed = templatekw.getrenamedfn(repo, endrev=endrev) + + ui.pager('log') displayer = show_changeset(ui, repo, opts, buffered=True) displaygraph(ui, repo, revdag, displayer, graphmod.asciiedges, getrenamed, filematcher) @@ -2483,7 +2472,7 @@ for f in list: if f in added: continue # we never unlink added files on remove - util.unlinkpath(repo.wjoin(f), ignoremissing=True) + repo.wvfs.unlinkpath(f, ignoremissing=True) repo[None].forget(list) if warn: @@ -2975,13 +2964,6 @@ clean = set(changes.clean) modadded = set() - # split between files known in target manifest and the others - smf = set(mf) - - # determine the exact nature of the deleted changesets - deladded = _deleted - smf - deleted = _deleted - deladded - # We need to account for the state of the file in the dirstate, # even when we revert against something else than parent. 
This will # slightly alter the behavior of revert (doing back up or not, delete @@ -3023,7 +3005,10 @@ # in case of merge, files that are actually added can be reported as # modified, we need to post process the result if p2 != nullid: - mergeadd = dsmodified - smf + mergeadd = set(dsmodified) + for path in dsmodified: + if path in mf: + mergeadd.remove(path) dsadded |= mergeadd dsmodified -= mergeadd @@ -3036,6 +3021,13 @@ dsremoved.add(src) names[src] = (repo.pathto(src, cwd), True) + # determine the exact nature of the deleted changesets + deladded = set(_deleted) + for path in _deleted: + if path in mf: + deladded.remove(path) + deleted = _deleted - deladded + # distinguish between file to forget and the other added = set() for abs in dsadded: @@ -3205,7 +3197,7 @@ def doremove(f): try: - util.unlinkpath(repo.wjoin(f)) + repo.wvfs.unlinkpath(f) except OSError: pass repo.dirstate.remove(f) @@ -3254,15 +3246,18 @@ diffopts = patch.difffeatureopts(repo.ui, whitespace=True) diffopts.nodates = True diffopts.git = True - reversehunks = repo.ui.configbool('experimental', - 'revertalternateinteractivemode', - True) + operation = 'discard' + reversehunks = True + if node != parent: + operation = 'revert' + reversehunks = repo.ui.configbool('experimental', + 'revertalternateinteractivemode', + True) if reversehunks: diff = patch.diff(repo, ctx.node(), None, m, opts=diffopts) else: diff = patch.diff(repo, None, ctx.node(), m, opts=diffopts) originalchunks = patch.parsepatch(diff) - operation = 'discard' if node == parent else 'revert' try: @@ -3366,11 +3361,6 @@ return cmd -def checkunresolved(ms): - ms._repo.ui.deprecwarn('checkunresolved moved from cmdutil to mergeutil', - '4.1') - return mergeutil.checkunresolved(ms) - # a list of (ui, repo, otherpeer, opts, missing) functions called by # commands.outgoing. 
"missing" is "missing" of the result of # "findcommonoutgoing()" @@ -3420,7 +3410,7 @@ raise error.Abort(msg, hint=hint) for f, clearable, allowcommit, msg, hint in unfinishedstates: if clearable and repo.vfs.exists(f): - util.unlink(repo.join(f)) + util.unlink(repo.vfs.join(f)) afterresolvedstates = [ ('graftstate', @@ -3477,10 +3467,3 @@ if after[1]: hint = after[0] raise error.Abort(_('no %s in progress') % task, hint=hint) - -class dirstateguard(dirstateguardmod.dirstateguard): - def __init__(self, repo, name): - dirstateguardmod.dirstateguard.__init__(self, repo, name) - repo.ui.deprecwarn( - 'dirstateguard has moved from cmdutil to dirstateguard', - '4.1') diff -r ed5b25874d99 -r 4baf79a77afa mercurial/color.py --- a/mercurial/color.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/color.py Fri Mar 24 08:37:26 2017 -0700 @@ -7,59 +7,491 @@ from __future__ import absolute_import -_styles = {'grep.match': 'red bold', - 'grep.linenumber': 'green', - 'grep.rev': 'green', - 'grep.change': 'green', - 'grep.sep': 'cyan', - 'grep.filename': 'magenta', - 'grep.user': 'magenta', - 'grep.date': 'magenta', - 'bookmarks.active': 'green', - 'branches.active': 'none', - 'branches.closed': 'black bold', - 'branches.current': 'green', - 'branches.inactive': 'none', - 'diff.changed': 'white', - 'diff.deleted': 'red', - 'diff.diffline': 'bold', - 'diff.extended': 'cyan bold', - 'diff.file_a': 'red bold', - 'diff.file_b': 'green bold', - 'diff.hunk': 'magenta', - 'diff.inserted': 'green', - 'diff.tab': '', - 'diff.trailingwhitespace': 'bold red_background', - 'changeset.public' : '', - 'changeset.draft' : '', - 'changeset.secret' : '', - 'diffstat.deleted': 'red', - 'diffstat.inserted': 'green', - 'histedit.remaining': 'red bold', - 'ui.prompt': 'yellow', - 'log.changeset': 'yellow', - 'patchbomb.finalsummary': '', - 'patchbomb.from': 'magenta', - 'patchbomb.to': 'cyan', - 'patchbomb.subject': 'green', - 'patchbomb.diffstats': '', - 'rebase.rebased': 'blue', - 
'rebase.remaining': 'red bold', - 'resolve.resolved': 'green bold', - 'resolve.unresolved': 'red bold', - 'shelve.age': 'cyan', - 'shelve.newest': 'green bold', - 'shelve.name': 'blue bold', - 'status.added': 'green bold', - 'status.clean': 'none', - 'status.copied': 'none', - 'status.deleted': 'cyan bold underline', - 'status.ignored': 'black bold', - 'status.modified': 'blue bold', - 'status.removed': 'red bold', - 'status.unknown': 'magenta bold underline', - 'tags.normal': 'green', - 'tags.local': 'black bold'} +import re + +from .i18n import _ + +from . import ( + encoding, + pycompat, + util +) + +try: + import curses + # Mapping from effect name to terminfo attribute name (or raw code) or + # color number. This will also force-load the curses module. + _baseterminfoparams = { + 'none': (True, 'sgr0', ''), + 'standout': (True, 'smso', ''), + 'underline': (True, 'smul', ''), + 'reverse': (True, 'rev', ''), + 'inverse': (True, 'rev', ''), + 'blink': (True, 'blink', ''), + 'dim': (True, 'dim', ''), + 'bold': (True, 'bold', ''), + 'invisible': (True, 'invis', ''), + 'italic': (True, 'sitm', ''), + 'black': (False, curses.COLOR_BLACK, ''), + 'red': (False, curses.COLOR_RED, ''), + 'green': (False, curses.COLOR_GREEN, ''), + 'yellow': (False, curses.COLOR_YELLOW, ''), + 'blue': (False, curses.COLOR_BLUE, ''), + 'magenta': (False, curses.COLOR_MAGENTA, ''), + 'cyan': (False, curses.COLOR_CYAN, ''), + 'white': (False, curses.COLOR_WHITE, ''), + } +except ImportError: + curses = None + _baseterminfoparams = {} + +# allow the extensions to change the default +_enabledbydefault = False + +# start and stop parameters for effects +_effects = { + 'none': 0, + 'black': 30, + 'red': 31, + 'green': 32, + 'yellow': 33, + 'blue': 34, + 'magenta': 35, + 'cyan': 36, + 'white': 37, + 'bold': 1, + 'italic': 3, + 'underline': 4, + 'inverse': 7, + 'dim': 2, + 'black_background': 40, + 'red_background': 41, + 'green_background': 42, + 'yellow_background': 43, + 'blue_background': 44, 
+ 'purple_background': 45, + 'cyan_background': 46, + 'white_background': 47, + } + +_defaultstyles = { + 'grep.match': 'red bold', + 'grep.linenumber': 'green', + 'grep.rev': 'green', + 'grep.change': 'green', + 'grep.sep': 'cyan', + 'grep.filename': 'magenta', + 'grep.user': 'magenta', + 'grep.date': 'magenta', + 'bookmarks.active': 'green', + 'branches.active': 'none', + 'branches.closed': 'black bold', + 'branches.current': 'green', + 'branches.inactive': 'none', + 'diff.changed': 'white', + 'diff.deleted': 'red', + 'diff.diffline': 'bold', + 'diff.extended': 'cyan bold', + 'diff.file_a': 'red bold', + 'diff.file_b': 'green bold', + 'diff.hunk': 'magenta', + 'diff.inserted': 'green', + 'diff.tab': '', + 'diff.trailingwhitespace': 'bold red_background', + 'changeset.public' : '', + 'changeset.draft' : '', + 'changeset.secret' : '', + 'diffstat.deleted': 'red', + 'diffstat.inserted': 'green', + 'histedit.remaining': 'red bold', + 'ui.prompt': 'yellow', + 'log.changeset': 'yellow', + 'patchbomb.finalsummary': '', + 'patchbomb.from': 'magenta', + 'patchbomb.to': 'cyan', + 'patchbomb.subject': 'green', + 'patchbomb.diffstats': '', + 'rebase.rebased': 'blue', + 'rebase.remaining': 'red bold', + 'resolve.resolved': 'green bold', + 'resolve.unresolved': 'red bold', + 'shelve.age': 'cyan', + 'shelve.newest': 'green bold', + 'shelve.name': 'blue bold', + 'status.added': 'green bold', + 'status.clean': 'none', + 'status.copied': 'none', + 'status.deleted': 'cyan bold underline', + 'status.ignored': 'black bold', + 'status.modified': 'blue bold', + 'status.removed': 'red bold', + 'status.unknown': 'magenta bold underline', + 'tags.normal': 'green', + 'tags.local': 'black bold', +} def loadcolortable(ui, extname, colortable): - _styles.update(colortable) + _defaultstyles.update(colortable) + +def _terminfosetup(ui, mode): + '''Initialize terminfo data and the terminal if we're in terminfo mode.''' + + # If we failed to load curses, we go ahead and return. 
+ if curses is None: + return + # Otherwise, see what the config file says. + if mode not in ('auto', 'terminfo'): + return + ui._terminfoparams.update(_baseterminfoparams) + + for key, val in ui.configitems('color'): + if key.startswith('color.'): + newval = (False, int(val), '') + ui._terminfoparams[key[6:]] = newval + elif key.startswith('terminfo.'): + newval = (True, '', val.replace('\\E', '\x1b')) + ui._terminfoparams[key[9:]] = newval + try: + curses.setupterm() + except curses.error as e: + ui._terminfoparams.clear() + return + + for key, (b, e, c) in ui._terminfoparams.items(): + if not b: + continue + if not c and not curses.tigetstr(e): + # Most terminals don't support dim, invis, etc, so don't be + # noisy and use ui.debug(). + ui.debug("no terminfo entry for %s\n" % e) + del ui._terminfoparams[key] + if not curses.tigetstr('setaf') or not curses.tigetstr('setab'): + # Only warn about missing terminfo entries if we explicitly asked for + # terminfo mode. + if mode == "terminfo": + ui.warn(_("no terminfo entry for setab/setaf: reverting to " + "ECMA-48 color\n")) + ui._terminfoparams.clear() + +def setup(ui): + """configure color on a ui + + That function both set the colormode for the ui object and read + the configuration looking for custom colors and effect definitions.""" + mode = _modesetup(ui) + ui._colormode = mode + if mode and mode != 'debug': + configstyles(ui) + +def _modesetup(ui): + if ui.plain(): + return None + default = 'never' + if _enabledbydefault: + default = 'auto' + config = ui.config('ui', 'color', default) + if config == 'debug': + return 'debug' + + auto = (config == 'auto') + always = not auto and util.parsebool(config) + if not always and not auto: + return None + + formatted = (always or (encoding.environ.get('TERM') != 'dumb' + and ui.formatted())) + + mode = ui.config('color', 'mode', 'auto') + + # If pager is active, color.pagermode overrides color.mode. 
+ if getattr(ui, 'pageractive', False): + mode = ui.config('color', 'pagermode', mode) + + realmode = mode + if mode == 'auto': + if pycompat.osname == 'nt': + term = encoding.environ.get('TERM') + # TERM won't be defined in a vanilla cmd.exe environment. + + # UNIX-like environments on Windows such as Cygwin and MSYS will + # set TERM. They appear to make a best effort attempt at setting it + # to something appropriate. However, not all environments with TERM + # defined support ANSI. Since "ansi" could result in terminal + # gibberish, we error on the side of selecting "win32". However, if + # w32effects is not defined, we almost certainly don't support + # "win32", so don't even try. + if (term and 'xterm' in term) or not w32effects: + realmode = 'ansi' + else: + realmode = 'win32' + else: + realmode = 'ansi' + + def modewarn(): + # only warn if color.mode was explicitly set and we're in + # a formatted terminal + if mode == realmode and ui.formatted(): + ui.warn(_('warning: failed to set color mode to %s\n') % mode) + + if realmode == 'win32': + ui._terminfoparams.clear() + if not w32effects: + modewarn() + return None + _effects.update(w32effects) + elif realmode == 'ansi': + ui._terminfoparams.clear() + elif realmode == 'terminfo': + _terminfosetup(ui, mode) + if not ui._terminfoparams: + ## FIXME Shouldn't we return None in this case too? + modewarn() + realmode = 'ansi' + else: + return None + + if always or (auto and formatted): + return realmode + return None + +def configstyles(ui): + ui._styles.update(_defaultstyles) + for status, cfgeffects in ui.configitems('color'): + if '.' 
not in status or status.startswith(('color.', 'terminfo.')): + continue + cfgeffects = ui.configlist('color', status) + if cfgeffects: + good = [] + for e in cfgeffects: + if valideffect(ui, e): + good.append(e) + else: + ui.warn(_("ignoring unknown color/effect %r " + "(configured in color.%s)\n") + % (e, status)) + ui._styles[status] = ' '.join(good) + +def valideffect(ui, effect): + 'Determine if the effect is valid or not.' + return ((not ui._terminfoparams and effect in _effects) + or (effect in ui._terminfoparams + or effect[:-11] in ui._terminfoparams)) + +def _effect_str(ui, effect): + '''Helper function for render_effects().''' + + bg = False + if effect.endswith('_background'): + bg = True + effect = effect[:-11] + try: + attr, val, termcode = ui._terminfoparams[effect] + except KeyError: + return '' + if attr: + if termcode: + return termcode + else: + return curses.tigetstr(val) + elif bg: + return curses.tparm(curses.tigetstr('setab'), val) + else: + return curses.tparm(curses.tigetstr('setaf'), val) + +def _mergeeffects(text, start, stop): + """Insert start sequence at every occurrence of stop sequence + + >>> s = _mergeeffects('cyan', '[C]', '|') + >>> s = _mergeeffects(s + 'yellow', '[Y]', '|') + >>> s = _mergeeffects('ma' + s + 'genta', '[M]', '|') + >>> s = _mergeeffects('red' + s, '[R]', '|') + >>> s + '[R]red[M]ma[Y][C]cyan|[R][M][Y]yellow|[R][M]genta|' + """ + parts = [] + for t in text.split(stop): + if not t: + continue + parts.extend([start, t, stop]) + return ''.join(parts) + +def _render_effects(ui, text, effects): + 'Wrap text in commands to turn on each effect.' 
+ if not text: + return text + if ui._terminfoparams: + start = ''.join(_effect_str(ui, effect) + for effect in ['none'] + effects.split()) + stop = _effect_str(ui, 'none') + else: + start = [str(_effects[e]) for e in ['none'] + effects.split()] + start = '\033[' + ';'.join(start) + 'm' + stop = '\033[' + str(_effects['none']) + 'm' + return _mergeeffects(text, start, stop) + +_ansieffectre = re.compile(br'\x1b\[[0-9;]*m') + +def stripeffects(text): + """Strip ANSI control codes which could be inserted by colorlabel()""" + return _ansieffectre.sub('', text) + +def colorlabel(ui, msg, label): + """add color control code according to the mode""" + if ui._colormode == 'debug': + if label and msg: + if msg[-1] == '\n': + msg = "[%s|%s]\n" % (label, msg[:-1]) + else: + msg = "[%s|%s]" % (label, msg) + elif ui._colormode is not None: + effects = [] + for l in label.split(): + s = ui._styles.get(l, '') + if s: + effects.append(s) + elif valideffect(ui, l): + effects.append(l) + effects = ' '.join(effects) + if effects: + msg = '\n'.join([_render_effects(ui, line, effects) + for line in msg.split('\n')]) + return msg + +w32effects = None +if pycompat.osname == 'nt': + import ctypes + + _kernel32 = ctypes.windll.kernel32 + + _WORD = ctypes.c_ushort + + _INVALID_HANDLE_VALUE = -1 + + class _COORD(ctypes.Structure): + _fields_ = [('X', ctypes.c_short), + ('Y', ctypes.c_short)] + + class _SMALL_RECT(ctypes.Structure): + _fields_ = [('Left', ctypes.c_short), + ('Top', ctypes.c_short), + ('Right', ctypes.c_short), + ('Bottom', ctypes.c_short)] + + class _CONSOLE_SCREEN_BUFFER_INFO(ctypes.Structure): + _fields_ = [('dwSize', _COORD), + ('dwCursorPosition', _COORD), + ('wAttributes', _WORD), + ('srWindow', _SMALL_RECT), + ('dwMaximumWindowSize', _COORD)] + + _STD_OUTPUT_HANDLE = 0xfffffff5 # (DWORD)-11 + _STD_ERROR_HANDLE = 0xfffffff4 # (DWORD)-12 + + _FOREGROUND_BLUE = 0x0001 + _FOREGROUND_GREEN = 0x0002 + _FOREGROUND_RED = 0x0004 + _FOREGROUND_INTENSITY = 0x0008 + + 
_BACKGROUND_BLUE = 0x0010 + _BACKGROUND_GREEN = 0x0020 + _BACKGROUND_RED = 0x0040 + _BACKGROUND_INTENSITY = 0x0080 + + _COMMON_LVB_REVERSE_VIDEO = 0x4000 + _COMMON_LVB_UNDERSCORE = 0x8000 + + # http://msdn.microsoft.com/en-us/library/ms682088%28VS.85%29.aspx + w32effects = { + 'none': -1, + 'black': 0, + 'red': _FOREGROUND_RED, + 'green': _FOREGROUND_GREEN, + 'yellow': _FOREGROUND_RED | _FOREGROUND_GREEN, + 'blue': _FOREGROUND_BLUE, + 'magenta': _FOREGROUND_BLUE | _FOREGROUND_RED, + 'cyan': _FOREGROUND_BLUE | _FOREGROUND_GREEN, + 'white': _FOREGROUND_RED | _FOREGROUND_GREEN | _FOREGROUND_BLUE, + 'bold': _FOREGROUND_INTENSITY, + 'black_background': 0x100, # unused value > 0x0f + 'red_background': _BACKGROUND_RED, + 'green_background': _BACKGROUND_GREEN, + 'yellow_background': _BACKGROUND_RED | _BACKGROUND_GREEN, + 'blue_background': _BACKGROUND_BLUE, + 'purple_background': _BACKGROUND_BLUE | _BACKGROUND_RED, + 'cyan_background': _BACKGROUND_BLUE | _BACKGROUND_GREEN, + 'white_background': (_BACKGROUND_RED | _BACKGROUND_GREEN | + _BACKGROUND_BLUE), + 'bold_background': _BACKGROUND_INTENSITY, + 'underline': _COMMON_LVB_UNDERSCORE, # double-byte charsets only + 'inverse': _COMMON_LVB_REVERSE_VIDEO, # double-byte charsets only + } + + passthrough = set([_FOREGROUND_INTENSITY, + _BACKGROUND_INTENSITY, + _COMMON_LVB_UNDERSCORE, + _COMMON_LVB_REVERSE_VIDEO]) + + stdout = _kernel32.GetStdHandle( + _STD_OUTPUT_HANDLE) # don't close the handle returned + if stdout is None or stdout == _INVALID_HANDLE_VALUE: + w32effects = None + else: + csbi = _CONSOLE_SCREEN_BUFFER_INFO() + if not _kernel32.GetConsoleScreenBufferInfo( + stdout, ctypes.byref(csbi)): + # stdout may not support GetConsoleScreenBufferInfo() + # when called from subprocess or redirected + w32effects = None + else: + origattr = csbi.wAttributes + ansire = re.compile('\033\[([^m]*)m([^\033]*)(.*)', + re.MULTILINE | re.DOTALL) + + def win32print(ui, writefunc, *msgs, **opts): + for text in msgs: + _win32print(ui, 
text, writefunc, **opts) + + def _win32print(ui, text, writefunc, **opts): + label = opts.get('label', '') + attr = origattr + + def mapcolor(val, attr): + if val == -1: + return origattr + elif val in passthrough: + return attr | val + elif val > 0x0f: + return (val & 0x70) | (attr & 0x8f) + else: + return (val & 0x07) | (attr & 0xf8) + + # determine console attributes based on labels + for l in label.split(): + style = ui._styles.get(l, '') + for effect in style.split(): + try: + attr = mapcolor(w32effects[effect], attr) + except KeyError: + # w32effects could not have certain attributes so we skip + # them if not found + pass + # hack to ensure regexp finds data + if not text.startswith('\033['): + text = '\033[m' + text + + # Look for ANSI-like codes embedded in text + m = re.match(ansire, text) + + try: + while m: + for sattr in m.group(1).split(';'): + if sattr: + attr = mapcolor(int(sattr), attr) + ui.flush() + _kernel32.SetConsoleTextAttribute(stdout, attr) + writefunc(m.group(2), **opts) + m = re.match(ansire, m.group(3)) + finally: + # Explicitly reset original attributes + ui.flush() + _kernel32.SetConsoleTextAttribute(stdout, origattr) diff -r ed5b25874d99 -r 4baf79a77afa mercurial/commands.py --- a/mercurial/commands.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/commands.py Fri Mar 24 08:37:26 2017 -0700 @@ -11,17 +11,10 @@ import errno import os import re -import socket -import string -import sys -import tempfile -import time from .i18n import _ from .node import ( - bin, hex, - nullhex, nullid, nullrev, short, @@ -40,30 +33,22 @@ error, exchange, extensions, - formatter, graphmod, hbisect, help, hg, lock as lockmod, merge as mergemod, - minirst, obsolete, patch, phases, - policy, - pvec, pycompat, - repair, - revlog, - revset, + revsetlang, scmutil, server, sshserver, - sslutil, streamclone, templatekw, - templater, ui as uimod, util, ) @@ -92,6 +77,11 @@ _('do not prompt, automatically pick the first choice for all prompts')), ('q', 'quiet', 
None, _('suppress output')), ('v', 'verbose', None, _('enable additional output')), + ('', 'color', '', + # i18n: 'always', 'auto', 'never', and 'debug' are keywords + # and should not be translated + _("when to colorize (boolean, always, auto, never, or debug)"), + _('TYPE')), ('', 'config', [], _('set/override config option (use \'section.name=value\')'), _('CONFIG')), @@ -107,6 +97,8 @@ ('', 'version', None, _('output version information and exit')), ('h', 'help', None, _('display help and exit')), ('', 'hidden', False, _('consider hidden changesets')), + ('', 'pager', 'auto', + _("when to paginate (boolean, always, auto, or never)"), _('TYPE')), ] dryrunopts = [('n', 'dry-run', None, @@ -433,6 +425,8 @@ if linenumber and (not opts.get('changeset')) and (not opts.get('number')): raise error.Abort(_('at least one of -n/-c is required for -l')) + ui.pager('annotate') + if fm.isplain(): def makefunc(get, fmt): return lambda x: fmt(get(x)) @@ -892,7 +886,8 @@ # update state state['current'] = [node] hbisect.save_state(repo, state) - status = ui.system(command, environ={'HG_NODE': hex(node)}) + status = ui.system(command, environ={'HG_NODE': hex(node)}, + blockedtag='bisect_check') if status == 125: transition = "skip" elif status == 0: @@ -1228,6 +1223,7 @@ Returns 0. 
""" + ui.pager('branches') fm = ui.formatter('branches', opts) hexfunc = fm.hexfunc @@ -1264,6 +1260,7 @@ fmt = ' ' * padsize + ' %d:%s' fm.condwrite(not ui.quiet, 'rev node', fmt, rev, hexfunc(ctx.node()), label='log.changeset changeset.%s' % ctx.phasestr()) + fm.context(ctx=ctx) fm.data(active=isactive, closed=not isopen, current=current) if not ui.quiet: fm.plain(notice) @@ -1427,6 +1424,7 @@ ctx = scmutil.revsingle(repo, opts.get('rev')) m = scmutil.match(ctx, (file1,) + pats, opts) + ui.pager('cat') return cmdutil.cat(ui, repo, ctx, m, '', **opts) @command('^clone', @@ -1638,10 +1636,12 @@ release(lock, wlock) def _docommit(ui, repo, *pats, **opts): + opts = pycompat.byteskwargs(opts) if opts.get('interactive'): opts.pop('interactive') ret = cmdutil.dorecord(ui, repo, commit, None, False, - cmdutil.recordfilter, *pats, **opts) + cmdutil.recordfilter, *pats, + **pycompat.strkwargs(opts)) # ret can be 0 (no changes to record) or the value returned by # commit(), 1 if nothing changed or None on success. 
return 1 if ret == 0 else ret @@ -1704,25 +1704,23 @@ return 1 else: def commitfunc(ui, repo, message, match, opts): - backup = ui.backupconfig('phases', 'new-commit') + overrides = {} + if opts.get('secret'): + overrides[('phases', 'new-commit')] = 'secret' + baseui = repo.baseui - basebackup = baseui.backupconfig('phases', 'new-commit') - try: - if opts.get('secret'): - ui.setconfig('phases', 'new-commit', 'secret', 'commit') - # Propagate to subrepos - baseui.setconfig('phases', 'new-commit', 'secret', 'commit') - - editform = cmdutil.mergeeditform(repo[None], 'commit.normal') - editor = cmdutil.getcommiteditor(editform=editform, **opts) - return repo.commit(message, opts.get('user'), opts.get('date'), - match, - editor=editor, - extra=extra) - finally: - ui.restoreconfig(backup) - repo.baseui.restoreconfig(basebackup) - + with baseui.configoverride(overrides, 'commit'): + with ui.configoverride(overrides, 'commit'): + editform = cmdutil.mergeeditform(repo[None], + 'commit.normal') + editor = cmdutil.getcommiteditor( + editform=editform, **pycompat.strkwargs(opts)) + return repo.commit(message, + opts.get('user'), + opts.get('date'), + match, + editor=editor, + extra=extra) node = cmdutil.commit(ui, repo, commitfunc, pats, opts) @@ -1775,7 +1773,7 @@ if opts.get('local'): if not repo: raise error.Abort(_("can't use --local outside a repository")) - paths = [repo.join('hgrc')] + paths = [repo.vfs.join('hgrc')] elif opts.get('global'): paths = scmutil.systemrcpath() else: @@ -1799,9 +1797,10 @@ editor = ui.geteditor() ui.system("%s \"%s\"" % (editor, f), - onerr=error.Abort, errprefix=_("edit failed")) + onerr=error.Abort, errprefix=_("edit failed"), + blockedtag='config_edit') return - + ui.pager('config') fm = ui.formatter('config', opts) for f in scmutil.rcpath(): ui.debug('read config from: %s\n' % f) @@ -1814,7 +1813,7 @@ matched = False for section, name, value in ui.walkconfig(untrusted=untrusted): source = ui.configsource(section, name, untrusted) - value 
= str(value) + value = pycompat.bytestr(value) if fm.isplain(): source = source or 'none' value = value.replace('\n', '\\n') @@ -1866,1176 +1865,6 @@ with repo.wlock(False): return cmdutil.copy(ui, repo, pats, opts) -@command('debuginstall', [] + formatteropts, '', norepo=True) -def debuginstall(ui, **opts): - '''test Mercurial installation - - Returns 0 on success. - ''' - - def writetemp(contents): - (fd, name) = tempfile.mkstemp(prefix="hg-debuginstall-") - f = os.fdopen(fd, "wb") - f.write(contents) - f.close() - return name - - problems = 0 - - fm = ui.formatter('debuginstall', opts) - fm.startitem() - - # encoding - fm.write('encoding', _("checking encoding (%s)...\n"), encoding.encoding) - err = None - try: - encoding.fromlocal("test") - except error.Abort as inst: - err = inst - problems += 1 - fm.condwrite(err, 'encodingerror', _(" %s\n" - " (check that your locale is properly set)\n"), err) - - # Python - fm.write('pythonexe', _("checking Python executable (%s)\n"), - pycompat.sysexecutable) - fm.write('pythonver', _("checking Python version (%s)\n"), - ("%d.%d.%d" % sys.version_info[:3])) - fm.write('pythonlib', _("checking Python lib (%s)...\n"), - os.path.dirname(os.__file__)) - - security = set(sslutil.supportedprotocols) - if sslutil.hassni: - security.add('sni') - - fm.write('pythonsecurity', _("checking Python security support (%s)\n"), - fm.formatlist(sorted(security), name='protocol', - fmt='%s', sep=',')) - - # These are warnings, not errors. So don't increment problem count. This - # may change in the future. 
- if 'tls1.2' not in security: - fm.plain(_(' TLS 1.2 not supported by Python install; ' - 'network connections lack modern security\n')) - if 'sni' not in security: - fm.plain(_(' SNI not supported by Python install; may have ' - 'connectivity issues with some servers\n')) - - # TODO print CA cert info - - # hg version - hgver = util.version() - fm.write('hgver', _("checking Mercurial version (%s)\n"), - hgver.split('+')[0]) - fm.write('hgverextra', _("checking Mercurial custom build (%s)\n"), - '+'.join(hgver.split('+')[1:])) - - # compiled modules - fm.write('hgmodulepolicy', _("checking module policy (%s)\n"), - policy.policy) - fm.write('hgmodules', _("checking installed modules (%s)...\n"), - os.path.dirname(__file__)) - - err = None - try: - from . import ( - base85, - bdiff, - mpatch, - osutil, - ) - dir(bdiff), dir(mpatch), dir(base85), dir(osutil) # quiet pyflakes - except Exception as inst: - err = inst - problems += 1 - fm.condwrite(err, 'extensionserror', " %s\n", err) - - compengines = util.compengines._engines.values() - fm.write('compengines', _('checking registered compression engines (%s)\n'), - fm.formatlist(sorted(e.name() for e in compengines), - name='compengine', fmt='%s', sep=', ')) - fm.write('compenginesavail', _('checking available compression engines ' - '(%s)\n'), - fm.formatlist(sorted(e.name() for e in compengines - if e.available()), - name='compengine', fmt='%s', sep=', ')) - wirecompengines = util.compengines.supportedwireengines(util.SERVERROLE) - fm.write('compenginesserver', _('checking available compression engines ' - 'for wire protocol (%s)\n'), - fm.formatlist([e.name() for e in wirecompengines - if e.wireprotosupport()], - name='compengine', fmt='%s', sep=', ')) - - # templates - p = templater.templatepaths() - fm.write('templatedirs', 'checking templates (%s)...\n', ' '.join(p)) - fm.condwrite(not p, '', _(" no template directories found\n")) - if p: - m = templater.templatepath("map-cmdline.default") - if m: - # template 
found, check if it is working - err = None - try: - templater.templater.frommapfile(m) - except Exception as inst: - err = inst - p = None - fm.condwrite(err, 'defaulttemplateerror', " %s\n", err) - else: - p = None - fm.condwrite(p, 'defaulttemplate', - _("checking default template (%s)\n"), m) - fm.condwrite(not m, 'defaulttemplatenotfound', - _(" template '%s' not found\n"), "default") - if not p: - problems += 1 - fm.condwrite(not p, '', - _(" (templates seem to have been installed incorrectly)\n")) - - # editor - editor = ui.geteditor() - editor = util.expandpath(editor) - fm.write('editor', _("checking commit editor... (%s)\n"), editor) - cmdpath = util.findexe(pycompat.shlexsplit(editor)[0]) - fm.condwrite(not cmdpath and editor == 'vi', 'vinotfound', - _(" No commit editor set and can't find %s in PATH\n" - " (specify a commit editor in your configuration" - " file)\n"), not cmdpath and editor == 'vi' and editor) - fm.condwrite(not cmdpath and editor != 'vi', 'editornotfound', - _(" Can't find editor '%s' in PATH\n" - " (specify a commit editor in your configuration" - " file)\n"), not cmdpath and editor) - if not cmdpath and editor != 'vi': - problems += 1 - - # check username - username = None - err = None - try: - username = ui.username() - except error.Abort as e: - err = e - problems += 1 - - fm.condwrite(username, 'username', _("checking username (%s)\n"), username) - fm.condwrite(err, 'usernameerror', _("checking username...\n %s\n" - " (specify a username in your configuration file)\n"), err) - - fm.condwrite(not problems, '', - _("no problems detected\n")) - if not problems: - fm.data(problems=problems) - fm.condwrite(problems, 'problems', - _("%d problems detected," - " please check your install!\n"), problems) - fm.end() - - return problems - -@command('debugknown', [], _('REPO ID...'), norepo=True) -def debugknown(ui, repopath, *ids, **opts): - """test whether node ids are known to a repo - - Every ID must be a full-length hex node id string. 
Returns a list of 0s - and 1s indicating unknown/known. - """ - repo = hg.peer(ui, opts, repopath) - if not repo.capable('known'): - raise error.Abort("known() not supported by target repository") - flags = repo.known([bin(s) for s in ids]) - ui.write("%s\n" % ("".join([f and "1" or "0" for f in flags]))) - -@command('debuglabelcomplete', [], _('LABEL...')) -def debuglabelcomplete(ui, repo, *args): - '''backwards compatibility with old bash completion scripts (DEPRECATED)''' - debugnamecomplete(ui, repo, *args) - -@command('debugmergestate', [], '') -def debugmergestate(ui, repo, *args): - """print merge state - - Use --verbose to print out information about whether v1 or v2 merge state - was chosen.""" - def _hashornull(h): - if h == nullhex: - return 'null' - else: - return h - - def printrecords(version): - ui.write(('* version %s records\n') % version) - if version == 1: - records = v1records - else: - records = v2records - - for rtype, record in records: - # pretty print some record types - if rtype == 'L': - ui.write(('local: %s\n') % record) - elif rtype == 'O': - ui.write(('other: %s\n') % record) - elif rtype == 'm': - driver, mdstate = record.split('\0', 1) - ui.write(('merge driver: %s (state "%s")\n') - % (driver, mdstate)) - elif rtype in 'FDC': - r = record.split('\0') - f, state, hash, lfile, afile, anode, ofile = r[0:7] - if version == 1: - onode = 'not stored in v1 format' - flags = r[7] - else: - onode, flags = r[7:9] - ui.write(('file: %s (record type "%s", state "%s", hash %s)\n') - % (f, rtype, state, _hashornull(hash))) - ui.write((' local path: %s (flags "%s")\n') % (lfile, flags)) - ui.write((' ancestor path: %s (node %s)\n') - % (afile, _hashornull(anode))) - ui.write((' other path: %s (node %s)\n') - % (ofile, _hashornull(onode))) - elif rtype == 'f': - filename, rawextras = record.split('\0', 1) - extras = rawextras.split('\0') - i = 0 - extrastrings = [] - while i < len(extras): - extrastrings.append('%s = %s' % (extras[i], extras[i + 
1])) - i += 2 - - ui.write(('file extras: %s (%s)\n') - % (filename, ', '.join(extrastrings))) - elif rtype == 'l': - labels = record.split('\0', 2) - labels = [l for l in labels if len(l) > 0] - ui.write(('labels:\n')) - ui.write((' local: %s\n' % labels[0])) - ui.write((' other: %s\n' % labels[1])) - if len(labels) > 2: - ui.write((' base: %s\n' % labels[2])) - else: - ui.write(('unrecognized entry: %s\t%s\n') - % (rtype, record.replace('\0', '\t'))) - - # Avoid mergestate.read() since it may raise an exception for unsupported - # merge state records. We shouldn't be doing this, but this is OK since this - # command is pretty low-level. - ms = mergemod.mergestate(repo) - - # sort so that reasonable information is on top - v1records = ms._readrecordsv1() - v2records = ms._readrecordsv2() - order = 'LOml' - def key(r): - idx = order.find(r[0]) - if idx == -1: - return (1, r[1]) - else: - return (0, idx) - v1records.sort(key=key) - v2records.sort(key=key) - - if not v1records and not v2records: - ui.write(('no merge state found\n')) - elif not v2records: - ui.note(('no version 2 merge state\n')) - printrecords(1) - elif ms._v1v2match(v1records, v2records): - ui.note(('v1 and v2 states match: using v2\n')) - printrecords(2) - else: - ui.note(('v1 and v2 states mismatch: using v1\n')) - printrecords(1) - if ui.verbose: - printrecords(2) - -@command('debugnamecomplete', [], _('NAME...')) -def debugnamecomplete(ui, repo, *args): - '''complete "names" - tags, open branch names, bookmark names''' - - names = set() - # since we previously only listed open branches, we will handle that - # specially (after this for loop) - for name, ns in repo.names.iteritems(): - if name != 'branches': - names.update(ns.listnames(repo)) - names.update(tag for (tag, heads, tip, closed) - in repo.branchmap().iterbranches() if not closed) - completions = set() - if not args: - args = [''] - for a in args: - completions.update(n for n in names if n.startswith(a)) - 
ui.write('\n'.join(sorted(completions))) - ui.write('\n') - -@command('debuglocks', - [('L', 'force-lock', None, _('free the store lock (DANGEROUS)')), - ('W', 'force-wlock', None, - _('free the working state lock (DANGEROUS)'))], - _('[OPTION]...')) -def debuglocks(ui, repo, **opts): - """show or modify state of locks - - By default, this command will show which locks are held. This - includes the user and process holding the lock, the amount of time - the lock has been held, and the machine name where the process is - running if it's not local. - - Locks protect the integrity of Mercurial's data, so should be - treated with care. System crashes or other interruptions may cause - locks to not be properly released, though Mercurial will usually - detect and remove such stale locks automatically. - - However, detecting stale locks may not always be possible (for - instance, on a shared filesystem). Removing locks may also be - blocked by filesystem permissions. - - Returns 0 if no locks are held. 
- - """ - - if opts.get('force_lock'): - repo.svfs.unlink('lock') - if opts.get('force_wlock'): - repo.vfs.unlink('wlock') - if opts.get('force_lock') or opts.get('force_lock'): - return 0 - - now = time.time() - held = 0 - - def report(vfs, name, method): - # this causes stale locks to get reaped for more accurate reporting - try: - l = method(False) - except error.LockHeld: - l = None - - if l: - l.release() - else: - try: - stat = vfs.lstat(name) - age = now - stat.st_mtime - user = util.username(stat.st_uid) - locker = vfs.readlock(name) - if ":" in locker: - host, pid = locker.split(':') - if host == socket.gethostname(): - locker = 'user %s, process %s' % (user, pid) - else: - locker = 'user %s, process %s, host %s' \ - % (user, pid, host) - ui.write(("%-6s %s (%ds)\n") % (name + ":", locker, age)) - return 1 - except OSError as e: - if e.errno != errno.ENOENT: - raise - - ui.write(("%-6s free\n") % (name + ":")) - return 0 - - held += report(repo.svfs, "lock", repo.lock) - held += report(repo.vfs, "wlock", repo.wlock) - - return held - -@command('debugobsolete', - [('', 'flags', 0, _('markers flag')), - ('', 'record-parents', False, - _('record parent information for the precursor')), - ('r', 'rev', [], _('display markers relevant to REV')), - ('', 'index', False, _('display index of the marker')), - ('', 'delete', [], _('delete markers specified by indices')), - ] + commitopts2 + formatteropts, - _('[OBSOLETED [REPLACEMENT ...]]')) -def debugobsolete(ui, repo, precursor=None, *successors, **opts): - """create arbitrary obsolete marker - - With no arguments, displays the list of obsolescence markers.""" - - def parsenodeid(s): - try: - # We do not use revsingle/revrange functions here to accept - # arbitrary node identifiers, possibly not present in the - # local repository. 
- n = bin(s) - if len(n) != len(nullid): - raise TypeError() - return n - except TypeError: - raise error.Abort('changeset references must be full hexadecimal ' - 'node identifiers') - - if opts.get('delete'): - indices = [] - for v in opts.get('delete'): - try: - indices.append(int(v)) - except ValueError: - raise error.Abort(_('invalid index value: %r') % v, - hint=_('use integers for indices')) - - if repo.currenttransaction(): - raise error.Abort(_('cannot delete obsmarkers in the middle ' - 'of transaction.')) - - with repo.lock(): - n = repair.deleteobsmarkers(repo.obsstore, indices) - ui.write(_('deleted %i obsolescence markers\n') % n) - - return - - if precursor is not None: - if opts['rev']: - raise error.Abort('cannot select revision when creating marker') - metadata = {} - metadata['user'] = opts['user'] or ui.username() - succs = tuple(parsenodeid(succ) for succ in successors) - l = repo.lock() - try: - tr = repo.transaction('debugobsolete') - try: - date = opts.get('date') - if date: - date = util.parsedate(date) - else: - date = None - prec = parsenodeid(precursor) - parents = None - if opts['record_parents']: - if prec not in repo.unfiltered(): - raise error.Abort('cannot used --record-parents on ' - 'unknown changesets') - parents = repo.unfiltered()[prec].parents() - parents = tuple(p.node() for p in parents) - repo.obsstore.create(tr, prec, succs, opts['flags'], - parents=parents, date=date, - metadata=metadata) - tr.close() - except ValueError as exc: - raise error.Abort(_('bad obsmarker input: %s') % exc) - finally: - tr.release() - finally: - l.release() - else: - if opts['rev']: - revs = scmutil.revrange(repo, opts['rev']) - nodes = [repo[r].node() for r in revs] - markers = list(obsolete.getmarkers(repo, nodes=nodes)) - markers.sort(key=lambda x: x._data) - else: - markers = obsolete.getmarkers(repo) - - markerstoiter = markers - isrelevant = lambda m: True - if opts.get('rev') and opts.get('index'): - markerstoiter = 
obsolete.getmarkers(repo) - markerset = set(markers) - isrelevant = lambda m: m in markerset - - fm = ui.formatter('debugobsolete', opts) - for i, m in enumerate(markerstoiter): - if not isrelevant(m): - # marker can be irrelevant when we're iterating over a set - # of markers (markerstoiter) which is bigger than the set - # of markers we want to display (markers) - # this can happen if both --index and --rev options are - # provided and thus we need to iterate over all of the markers - # to get the correct indices, but only display the ones that - # are relevant to --rev value - continue - fm.startitem() - ind = i if opts.get('index') else None - cmdutil.showmarker(fm, m, index=ind) - fm.end() - -@command('debugpathcomplete', - [('f', 'full', None, _('complete an entire path')), - ('n', 'normal', None, _('show only normal files')), - ('a', 'added', None, _('show only added files')), - ('r', 'removed', None, _('show only removed files'))], - _('FILESPEC...')) -def debugpathcomplete(ui, repo, *specs, **opts): - '''complete part or all of a tracked path - - This command supports shells that offer path name completion. It - currently completes only files already known to the dirstate. 
- - Completion extends only to the next path segment unless - --full is specified, in which case entire paths are used.''' - - def complete(path, acceptable): - dirstate = repo.dirstate - spec = os.path.normpath(os.path.join(pycompat.getcwd(), path)) - rootdir = repo.root + pycompat.ossep - if spec != repo.root and not spec.startswith(rootdir): - return [], [] - if os.path.isdir(spec): - spec += '/' - spec = spec[len(rootdir):] - fixpaths = pycompat.ossep != '/' - if fixpaths: - spec = spec.replace(pycompat.ossep, '/') - speclen = len(spec) - fullpaths = opts['full'] - files, dirs = set(), set() - adddir, addfile = dirs.add, files.add - for f, st in dirstate.iteritems(): - if f.startswith(spec) and st[0] in acceptable: - if fixpaths: - f = f.replace('/', pycompat.ossep) - if fullpaths: - addfile(f) - continue - s = f.find(pycompat.ossep, speclen) - if s >= 0: - adddir(f[:s]) - else: - addfile(f) - return files, dirs - - acceptable = '' - if opts['normal']: - acceptable += 'nm' - if opts['added']: - acceptable += 'a' - if opts['removed']: - acceptable += 'r' - cwd = repo.getcwd() - if not specs: - specs = ['.'] - - files, dirs = set(), set() - for spec in specs: - f, d = complete(spec, acceptable or 'nmar') - files.update(f) - dirs.update(d) - files.update(dirs) - ui.write('\n'.join(repo.pathto(p, cwd) for p in sorted(files))) - ui.write('\n') - -@command('debugpushkey', [], _('REPO NAMESPACE [KEY OLD NEW]'), norepo=True) -def debugpushkey(ui, repopath, namespace, *keyinfo, **opts): - '''access the pushkey key/value protocol - - With two args, list the keys in the given namespace. - - With five args, set a key to new if it currently is set to old. - Reports success or failure. 
- ''' - - target = hg.peer(ui, {}, repopath) - if keyinfo: - key, old, new = keyinfo - r = target.pushkey(namespace, key, old, new) - ui.status(str(r) + '\n') - return not r - else: - for k, v in sorted(target.listkeys(namespace).iteritems()): - ui.write("%s\t%s\n" % (k.encode('string-escape'), - v.encode('string-escape'))) - -@command('debugpvec', [], _('A B')) -def debugpvec(ui, repo, a, b=None): - ca = scmutil.revsingle(repo, a) - cb = scmutil.revsingle(repo, b) - pa = pvec.ctxpvec(ca) - pb = pvec.ctxpvec(cb) - if pa == pb: - rel = "=" - elif pa > pb: - rel = ">" - elif pa < pb: - rel = "<" - elif pa | pb: - rel = "|" - ui.write(_("a: %s\n") % pa) - ui.write(_("b: %s\n") % pb) - ui.write(_("depth(a): %d depth(b): %d\n") % (pa._depth, pb._depth)) - ui.write(_("delta: %d hdist: %d distance: %d relation: %s\n") % - (abs(pa._depth - pb._depth), pvec._hamming(pa._vec, pb._vec), - pa.distance(pb), rel)) - -@command('debugrebuilddirstate|debugrebuildstate', - [('r', 'rev', '', _('revision to rebuild to'), _('REV')), - ('', 'minimal', None, _('only rebuild files that are inconsistent with ' - 'the working copy parent')), - ], - _('[-r REV]')) -def debugrebuilddirstate(ui, repo, rev, **opts): - """rebuild the dirstate as it would look like for the given revision - - If no revision is specified the first current parent will be used. - - The dirstate will be set to the files of the given revision. - The actual working directory content or existing dirstate - information such as adds or removes is not considered. - - ``minimal`` will only rebuild the dirstate status for files that claim to be - tracked but are not in the parent manifest, or that exist in the parent - manifest but are not in the dirstate. It will not change adds, removes, or - modified files that are in the working copy parent. - - One use of this command is to make the next :hg:`status` invocation - check the actual file content. 
- """ - ctx = scmutil.revsingle(repo, rev) - with repo.wlock(): - dirstate = repo.dirstate - changedfiles = None - # See command doc for what minimal does. - if opts.get('minimal'): - manifestfiles = set(ctx.manifest().keys()) - dirstatefiles = set(dirstate) - manifestonly = manifestfiles - dirstatefiles - dsonly = dirstatefiles - manifestfiles - dsnotadded = set(f for f in dsonly if dirstate[f] != 'a') - changedfiles = manifestonly | dsnotadded - - dirstate.rebuild(ctx.node(), ctx.manifest(), changedfiles) - -@command('debugrebuildfncache', [], '') -def debugrebuildfncache(ui, repo): - """rebuild the fncache file""" - repair.rebuildfncache(ui, repo) - -@command('debugrename', - [('r', 'rev', '', _('revision to debug'), _('REV'))], - _('[-r REV] FILE')) -def debugrename(ui, repo, file1, *pats, **opts): - """dump rename information""" - - ctx = scmutil.revsingle(repo, opts.get('rev')) - m = scmutil.match(ctx, (file1,) + pats, opts) - for abs in ctx.walk(m): - fctx = ctx[abs] - o = fctx.filelog().renamed(fctx.filenode()) - rel = m.rel(abs) - if o: - ui.write(_("%s renamed from %s:%s\n") % (rel, o[0], hex(o[1]))) - else: - ui.write(_("%s not renamed\n") % rel) - -@command('debugrevlog', debugrevlogopts + - [('d', 'dump', False, _('dump index data'))], - _('-c|-m|FILE'), - optionalrepo=True) -def debugrevlog(ui, repo, file_=None, **opts): - """show data and statistics about a revlog""" - r = cmdutil.openrevlog(repo, 'debugrevlog', file_, opts) - - if opts.get("dump"): - numrevs = len(r) - ui.write(("# rev p1rev p2rev start end deltastart base p1 p2" - " rawsize totalsize compression heads chainlen\n")) - ts = 0 - heads = set() - - for rev in xrange(numrevs): - dbase = r.deltaparent(rev) - if dbase == -1: - dbase = rev - cbase = r.chainbase(rev) - clen = r.chainlen(rev) - p1, p2 = r.parentrevs(rev) - rs = r.rawsize(rev) - ts = ts + rs - heads -= set(r.parentrevs(rev)) - heads.add(rev) - try: - compression = ts / r.end(rev) - except ZeroDivisionError: - compression = 0 - 
ui.write("%5d %5d %5d %5d %5d %10d %4d %4d %4d %7d %9d " - "%11d %5d %8d\n" % - (rev, p1, p2, r.start(rev), r.end(rev), - r.start(dbase), r.start(cbase), - r.start(p1), r.start(p2), - rs, ts, compression, len(heads), clen)) - return 0 - - v = r.version - format = v & 0xFFFF - flags = [] - gdelta = False - if v & revlog.REVLOGNGINLINEDATA: - flags.append('inline') - if v & revlog.REVLOGGENERALDELTA: - gdelta = True - flags.append('generaldelta') - if not flags: - flags = ['(none)'] - - nummerges = 0 - numfull = 0 - numprev = 0 - nump1 = 0 - nump2 = 0 - numother = 0 - nump1prev = 0 - nump2prev = 0 - chainlengths = [] - - datasize = [None, 0, 0] - fullsize = [None, 0, 0] - deltasize = [None, 0, 0] - chunktypecounts = {} - chunktypesizes = {} - - def addsize(size, l): - if l[0] is None or size < l[0]: - l[0] = size - if size > l[1]: - l[1] = size - l[2] += size - - numrevs = len(r) - for rev in xrange(numrevs): - p1, p2 = r.parentrevs(rev) - delta = r.deltaparent(rev) - if format > 0: - addsize(r.rawsize(rev), datasize) - if p2 != nullrev: - nummerges += 1 - size = r.length(rev) - if delta == nullrev: - chainlengths.append(0) - numfull += 1 - addsize(size, fullsize) - else: - chainlengths.append(chainlengths[delta] + 1) - addsize(size, deltasize) - if delta == rev - 1: - numprev += 1 - if delta == p1: - nump1prev += 1 - elif delta == p2: - nump2prev += 1 - elif delta == p1: - nump1 += 1 - elif delta == p2: - nump2 += 1 - elif delta != nullrev: - numother += 1 - - # Obtain data on the raw chunks in the revlog. 
- chunk = r._chunkraw(rev, rev)[1] - if chunk: - chunktype = chunk[0] - else: - chunktype = 'empty' - - if chunktype not in chunktypecounts: - chunktypecounts[chunktype] = 0 - chunktypesizes[chunktype] = 0 - - chunktypecounts[chunktype] += 1 - chunktypesizes[chunktype] += size - - # Adjust size min value for empty cases - for size in (datasize, fullsize, deltasize): - if size[0] is None: - size[0] = 0 - - numdeltas = numrevs - numfull - numoprev = numprev - nump1prev - nump2prev - totalrawsize = datasize[2] - datasize[2] /= numrevs - fulltotal = fullsize[2] - fullsize[2] /= numfull - deltatotal = deltasize[2] - if numrevs - numfull > 0: - deltasize[2] /= numrevs - numfull - totalsize = fulltotal + deltatotal - avgchainlen = sum(chainlengths) / numrevs - maxchainlen = max(chainlengths) - compratio = 1 - if totalsize: - compratio = totalrawsize / totalsize - - basedfmtstr = '%%%dd\n' - basepcfmtstr = '%%%dd %s(%%5.2f%%%%)\n' - - def dfmtstr(max): - return basedfmtstr % len(str(max)) - def pcfmtstr(max, padding=0): - return basepcfmtstr % (len(str(max)), ' ' * padding) - - def pcfmt(value, total): - if total: - return (value, 100 * float(value) / total) - else: - return value, 100.0 - - ui.write(('format : %d\n') % format) - ui.write(('flags : %s\n') % ', '.join(flags)) - - ui.write('\n') - fmt = pcfmtstr(totalsize) - fmt2 = dfmtstr(totalsize) - ui.write(('revisions : ') + fmt2 % numrevs) - ui.write((' merges : ') + fmt % pcfmt(nummerges, numrevs)) - ui.write((' normal : ') + fmt % pcfmt(numrevs - nummerges, numrevs)) - ui.write(('revisions : ') + fmt2 % numrevs) - ui.write((' full : ') + fmt % pcfmt(numfull, numrevs)) - ui.write((' deltas : ') + fmt % pcfmt(numdeltas, numrevs)) - ui.write(('revision size : ') + fmt2 % totalsize) - ui.write((' full : ') + fmt % pcfmt(fulltotal, totalsize)) - ui.write((' deltas : ') + fmt % pcfmt(deltatotal, totalsize)) - - def fmtchunktype(chunktype): - if chunktype == 'empty': - return ' %s : ' % chunktype - elif chunktype in 
string.ascii_letters: - return ' 0x%s (%s) : ' % (hex(chunktype), chunktype) - else: - return ' 0x%s : ' % hex(chunktype) - - ui.write('\n') - ui.write(('chunks : ') + fmt2 % numrevs) - for chunktype in sorted(chunktypecounts): - ui.write(fmtchunktype(chunktype)) - ui.write(fmt % pcfmt(chunktypecounts[chunktype], numrevs)) - ui.write(('chunks size : ') + fmt2 % totalsize) - for chunktype in sorted(chunktypecounts): - ui.write(fmtchunktype(chunktype)) - ui.write(fmt % pcfmt(chunktypesizes[chunktype], totalsize)) - - ui.write('\n') - fmt = dfmtstr(max(avgchainlen, compratio)) - ui.write(('avg chain length : ') + fmt % avgchainlen) - ui.write(('max chain length : ') + fmt % maxchainlen) - ui.write(('compression ratio : ') + fmt % compratio) - - if format > 0: - ui.write('\n') - ui.write(('uncompressed data size (min/max/avg) : %d / %d / %d\n') - % tuple(datasize)) - ui.write(('full revision size (min/max/avg) : %d / %d / %d\n') - % tuple(fullsize)) - ui.write(('delta size (min/max/avg) : %d / %d / %d\n') - % tuple(deltasize)) - - if numdeltas > 0: - ui.write('\n') - fmt = pcfmtstr(numdeltas) - fmt2 = pcfmtstr(numdeltas, 4) - ui.write(('deltas against prev : ') + fmt % pcfmt(numprev, numdeltas)) - if numprev > 0: - ui.write((' where prev = p1 : ') + fmt2 % pcfmt(nump1prev, - numprev)) - ui.write((' where prev = p2 : ') + fmt2 % pcfmt(nump2prev, - numprev)) - ui.write((' other : ') + fmt2 % pcfmt(numoprev, - numprev)) - if gdelta: - ui.write(('deltas against p1 : ') - + fmt % pcfmt(nump1, numdeltas)) - ui.write(('deltas against p2 : ') - + fmt % pcfmt(nump2, numdeltas)) - ui.write(('deltas against other : ') + fmt % pcfmt(numother, - numdeltas)) - -@command('debugrevspec', - [('', 'optimize', None, - _('print parsed tree after optimizing (DEPRECATED)')), - ('p', 'show-stage', [], - _('print parsed tree at the given stage'), _('NAME')), - ('', 'no-optimized', False, _('evaluate tree without optimization')), - ('', 'verify-optimized', False, _('verify optimized result')), 
- ], - ('REVSPEC')) -def debugrevspec(ui, repo, expr, **opts): - """parse and apply a revision specification - - Use -p/--show-stage option to print the parsed tree at the given stages. - Use -p all to print tree at every stage. - - Use --verify-optimized to compare the optimized result with the unoptimized - one. Returns 1 if the optimized result differs. - """ - stages = [ - ('parsed', lambda tree: tree), - ('expanded', lambda tree: revset.expandaliases(ui, tree)), - ('concatenated', revset.foldconcat), - ('analyzed', revset.analyze), - ('optimized', revset.optimize), - ] - if opts['no_optimized']: - stages = stages[:-1] - if opts['verify_optimized'] and opts['no_optimized']: - raise error.Abort(_('cannot use --verify-optimized with ' - '--no-optimized')) - stagenames = set(n for n, f in stages) - - showalways = set() - showchanged = set() - if ui.verbose and not opts['show_stage']: - # show parsed tree by --verbose (deprecated) - showalways.add('parsed') - showchanged.update(['expanded', 'concatenated']) - if opts['optimize']: - showalways.add('optimized') - if opts['show_stage'] and opts['optimize']: - raise error.Abort(_('cannot use --optimize with --show-stage')) - if opts['show_stage'] == ['all']: - showalways.update(stagenames) - else: - for n in opts['show_stage']: - if n not in stagenames: - raise error.Abort(_('invalid stage name: %s') % n) - showalways.update(opts['show_stage']) - - treebystage = {} - printedtree = None - tree = revset.parse(expr, lookup=repo.__contains__) - for n, f in stages: - treebystage[n] = tree = f(tree) - if n in showalways or (n in showchanged and tree != printedtree): - if opts['show_stage'] or n != 'parsed': - ui.write(("* %s:\n") % n) - ui.write(revset.prettyformat(tree), "\n") - printedtree = tree - - if opts['verify_optimized']: - arevs = revset.makematcher(treebystage['analyzed'])(repo) - brevs = revset.makematcher(treebystage['optimized'])(repo) - if ui.verbose: - ui.note(("* analyzed set:\n"), 
revset.prettyformatset(arevs), "\n") - ui.note(("* optimized set:\n"), revset.prettyformatset(brevs), "\n") - arevs = list(arevs) - brevs = list(brevs) - if arevs == brevs: - return 0 - ui.write(('--- analyzed\n'), label='diff.file_a') - ui.write(('+++ optimized\n'), label='diff.file_b') - sm = difflib.SequenceMatcher(None, arevs, brevs) - for tag, alo, ahi, blo, bhi in sm.get_opcodes(): - if tag in ('delete', 'replace'): - for c in arevs[alo:ahi]: - ui.write('-%s\n' % c, label='diff.deleted') - if tag in ('insert', 'replace'): - for c in brevs[blo:bhi]: - ui.write('+%s\n' % c, label='diff.inserted') - if tag == 'equal': - for c in arevs[alo:ahi]: - ui.write(' %s\n' % c) - return 1 - - func = revset.makematcher(tree) - revs = func(repo) - if ui.verbose: - ui.note(("* set:\n"), revset.prettyformatset(revs), "\n") - for c in revs: - ui.write("%s\n" % c) - -@command('debugsetparents', [], _('REV1 [REV2]')) -def debugsetparents(ui, repo, rev1, rev2=None): - """manually set the parents of the current working directory - - This is useful for writing repository conversion tools, but should - be used with care. For example, neither the working directory nor the - dirstate is updated, so file status may be incorrect after running this - command. - - Returns 0 on success. 
- """ - - r1 = scmutil.revsingle(repo, rev1).node() - r2 = scmutil.revsingle(repo, rev2, 'null').node() - - with repo.wlock(): - repo.setparents(r1, r2) - -@command('debugdirstate|debugstate', - [('', 'nodates', None, _('do not display the saved mtime')), - ('', 'datesort', None, _('sort by saved mtime'))], - _('[OPTION]...')) -def debugstate(ui, repo, **opts): - """show the contents of the current dirstate""" - - nodates = opts.get('nodates') - datesort = opts.get('datesort') - - timestr = "" - if datesort: - keyfunc = lambda x: (x[1][3], x[0]) # sort by mtime, then by filename - else: - keyfunc = None # sort by filename - for file_, ent in sorted(repo.dirstate._map.iteritems(), key=keyfunc): - if ent[3] == -1: - timestr = 'unset ' - elif nodates: - timestr = 'set ' - else: - timestr = time.strftime("%Y-%m-%d %H:%M:%S ", - time.localtime(ent[3])) - if ent[1] & 0o20000: - mode = 'lnk' - else: - mode = '%3o' % (ent[1] & 0o777 & ~util.umask) - ui.write("%c %s %10d %s%s\n" % (ent[0], mode, ent[2], timestr, file_)) - for f in repo.dirstate.copies(): - ui.write(_("copy: %s -> %s\n") % (repo.dirstate.copied(f), f)) - -@command('debugsub', - [('r', 'rev', '', - _('revision to check'), _('REV'))], - _('[-r REV] [REV]')) -def debugsub(ui, repo, rev=None): - ctx = scmutil.revsingle(repo, rev, None) - for k, v in sorted(ctx.substate.items()): - ui.write(('path %s\n') % k) - ui.write((' source %s\n') % v[0]) - ui.write((' revision %s\n') % v[1]) - -@command('debugsuccessorssets', - [], - _('[REV]')) -def debugsuccessorssets(ui, repo, *revs): - """show set of successors for revision - - A successors set of changeset A is a consistent group of revisions that - succeed A. It contains non-obsolete changesets only. - - In most cases a changeset A has a single successors set containing a single - successor (changeset A replaced by A'). - - A changeset that is made obsolete with no successors are called "pruned". - Such changesets have no successors sets at all. 
- - A changeset that has been "split" will have a successors set containing - more than one successor. - - A changeset that has been rewritten in multiple different ways is called - "divergent". Such changesets have multiple successor sets (each of which - may also be split, i.e. have multiple successors). - - Results are displayed as follows:: - - - - - - - - Here rev2 has two possible (i.e. divergent) successors sets. The first - holds one element, whereas the second holds three (i.e. the changeset has - been split). - """ - # passed to successorssets caching computation from one call to another - cache = {} - ctx2str = str - node2str = short - if ui.debug(): - def ctx2str(ctx): - return ctx.hex() - node2str = hex - for rev in scmutil.revrange(repo, revs): - ctx = repo[rev] - ui.write('%s\n'% ctx2str(ctx)) - for succsset in obsolete.successorssets(repo, ctx.node(), cache): - if succsset: - ui.write(' ') - ui.write(node2str(succsset[0])) - for node in succsset[1:]: - ui.write(' ') - ui.write(node2str(node)) - ui.write('\n') - -@command('debugtemplate', - [('r', 'rev', [], _('apply template on changesets'), _('REV')), - ('D', 'define', [], _('define template keyword'), _('KEY=VALUE'))], - _('[-r REV]... [-D KEY=VALUE]... TEMPLATE'), - optionalrepo=True) -def debugtemplate(ui, repo, tmpl, **opts): - """parse and apply a template - - If -r/--rev is given, the template is processed as a log template and - applied to the given changesets. Otherwise, it is processed as a generic - template. - - Use --verbose to print the parsed tree. 
- """ - revs = None - if opts['rev']: - if repo is None: - raise error.RepoError(_('there is no Mercurial repository here ' - '(.hg not found)')) - revs = scmutil.revrange(repo, opts['rev']) - - props = {} - for d in opts['define']: - try: - k, v = (e.strip() for e in d.split('=', 1)) - if not k: - raise ValueError - props[k] = v - except ValueError: - raise error.Abort(_('malformed keyword definition: %s') % d) - - if ui.verbose: - aliases = ui.configitems('templatealias') - tree = templater.parse(tmpl) - ui.note(templater.prettyformat(tree), '\n') - newtree = templater.expandaliases(tree, aliases) - if newtree != tree: - ui.note(("* expanded:\n"), templater.prettyformat(newtree), '\n') - - mapfile = None - if revs is None: - k = 'debugtemplate' - t = formatter.maketemplater(ui, k, tmpl) - ui.write(templater.stringify(t(k, **props))) - else: - displayer = cmdutil.changeset_templater(ui, repo, None, opts, tmpl, - mapfile, buffered=False) - for r in revs: - displayer.show(repo[r], **props) - displayer.close() - -@command('debugwalk', walkopts, _('[OPTION]... [FILE]...'), inferrepo=True) -def debugwalk(ui, repo, *pats, **opts): - """show how files match on given patterns""" - m = scmutil.match(repo[None], pats, opts) - items = list(repo.walk(m)) - if not items: - return - f = lambda fn: fn - if ui.configbool('ui', 'slash') and pycompat.ossep != '/': - f = lambda fn: util.normpath(fn) - fmt = 'f %%-%ds %%-%ds %%s' % ( - max([len(abs) for abs in items]), - max([len(m.rel(abs)) for abs in items])) - for abs in items: - line = fmt % (abs, f(m.rel(abs)), m.exact(abs) and 'exact' or '') - ui.write("%s\n" % line.rstrip()) - -@command('debugwireargs', - [('', 'three', '', 'three'), - ('', 'four', '', 'four'), - ('', 'five', '', 'five'), - ] + remoteopts, - _('REPO [OPTIONS]... 
[ONE [TWO]]'), - norepo=True) -def debugwireargs(ui, repopath, *vals, **opts): - repo = hg.peer(ui, opts, repopath) - for opt in remoteopts: - del opts[opt[1]] - args = {} - for k, v in opts.iteritems(): - if v: - args[k] = v - # run twice to check that we don't mess up the stream for the next command - res1 = repo.debugwireargs(*vals, **args) - res2 = repo.debugwireargs(*vals, **args) - ui.write("%s\n" % res1) - if res1 != res2: - ui.warn("%s\n" % res2) - @command('^diff', [('r', 'rev', [], _('revision'), _('REV')), ('c', 'change', '', _('change made by revision'), _('REV')) @@ -3119,6 +1948,7 @@ diffopts = patch.diffallopts(ui, opts) m = scmutil.match(repo[node2], pats, opts) + ui.pager('diff') cmdutil.diffordiffstat(ui, repo, diffopts, node1, node2, m, stat=stat, listsubrepos=opts.get('subrepos'), root=opts.get('root')) @@ -3200,6 +2030,7 @@ ui.note(_('exporting patches:\n')) else: ui.note(_('exporting patch:\n')) + ui.pager('export') cmdutil.export(repo, revs, template=opts.get('output'), switch_parent=opts.get('switch_parent'), opts=patch.diffallopts(ui, opts)) @@ -3253,7 +2084,7 @@ Returns 0 if a match is found, 1 otherwise. 
""" - ctx = scmutil.revsingle(repo, opts.get('rev'), None) + ctx = scmutil.revsingle(repo, opts.get(r'rev'), None) end = '\n' if opts.get('print0'): @@ -3261,6 +2092,7 @@ fmt = '%s' + end m = scmutil.match(ctx, pats, opts) + ui.pager('files') with ui.formatter('files', opts) as fm: return cmdutil.files(ui, ctx, m, fm, fmt, opts.get('subrepos')) @@ -3552,7 +2384,7 @@ # remove state when we complete successfully if not opts.get('dry_run'): - util.unlinkpath(repo.join('graftstate'), ignoremissing=True) + repo.vfs.unlinkpath('graftstate', ignoremissing=True) return 0 @@ -3782,6 +2614,7 @@ except error.LookupError: pass + ui.pager('grep') fm = ui.formatter('grep', opts) for ctx in cmdutil.walkchangerevs(repo, matchfn, opts, prep): rev = ctx.rev() @@ -3872,6 +2705,7 @@ if not heads: return 1 + ui.pager('heads') heads = sorted(heads, key=lambda x: -x.rev()) displayer = cmdutil.show_changeset(ui, repo, opts) for ctx in heads: @@ -3897,11 +2731,6 @@ Returns 0 if successful. """ - textwidth = ui.configint('ui', 'textwidth', 78) - termwidth = ui.termwidth() - 2 - if textwidth <= 0 or termwidth < textwidth: - textwidth = termwidth - keep = opts.get('system') or [] if len(keep) == 0: if pycompat.sysplatform.startswith('win'): @@ -3916,36 +2745,8 @@ if ui.verbose: keep.append('verbose') - section = None - subtopic = None - if name and '.' in name: - name, remaining = name.split('.', 1) - remaining = encoding.lower(remaining) - if '.' 
in remaining: - subtopic, section = remaining.split('.', 1) - else: - if name in help.subtopics: - subtopic = remaining - else: - section = remaining - - text = help.help_(ui, name, subtopic=subtopic, **opts) - - formatted, pruned = minirst.format(text, textwidth, keep=keep, - section=section) - - # We could have been given a weird ".foo" section without a name - # to look for, or we could have simply failed to found "foo.bar" - # because bar isn't a section of foo - if section and not (formatted and name): - raise error.Abort(_("help section not found")) - - if 'verbose' in pruned: - keep.append('omitted') - else: - keep.append('notomitted') - formatted, pruned = minirst.format(text, textwidth, keep=keep, - section=section) + formatted = help.formattedhelp(ui, name, keep=keep, **opts) + ui.pager('help') ui.write(formatted) @@ -4127,8 +2928,9 @@ Import a list of patches and commit them individually (unless --no-commit is specified). - To read a patch from standard input, use "-" as the patch name. If - a URL is specified, the patch will be downloaded from there. + To read a patch from standard input (stdin), use "-" as the patch + name. If a URL is specified, the patch will be downloaded from + there. 
Import first applies changes to the working directory (unless --bypass is specified), import will abort if there are outstanding @@ -4198,6 +3000,10 @@ hg import incoming-patches.mbox + - import patches from stdin:: + + hg import - + - attempt to exactly restore an exported changeset (not always possible):: @@ -4392,6 +3198,7 @@ if 'bookmarks' not in other.listkeys('namespaces'): ui.warn(_("remote doesn't support bookmarks\n")) return 0 + ui.pager('incoming') ui.status(_('comparing with %s\n') % util.hidepassword(source)) return bookmarks.incoming(ui, repo, other) @@ -4458,6 +3265,7 @@ m = scmutil.match(ctx, pats, opts, default='relglob', badfn=lambda x, y: False) + ui.pager('locate') for abs in ctx.matches(m): if opts.get('fullpath'): ui.write(repo.wjoin(abs), end) @@ -4588,12 +3396,13 @@ Returns 0 on success. """ + opts = pycompat.byteskwargs(opts) if opts.get('follow') and opts.get('rev'): - opts['rev'] = [revset.formatspec('reverse(::%lr)', opts.get('rev'))] + opts['rev'] = [revsetlang.formatspec('reverse(::%lr)', opts.get('rev'))] del opts['follow'] if opts.get('graph'): - return cmdutil.graphlog(ui, repo, *pats, **opts) + return cmdutil.graphlog(ui, repo, pats, opts) revs, expr, filematcher = cmdutil.getlogrevs(repo, pats, opts) limit = cmdutil.loglimit(opts) @@ -4606,6 +3415,7 @@ endrev = scmutil.revrange(repo, opts.get('rev')).max() + 1 getrenamed = templatekw.getrenamedfn(repo, endrev=endrev) + ui.pager('log') displayer = cmdutil.show_changeset(ui, repo, opts, buffered=True) for rev in revs: if count == limit: @@ -4648,7 +3458,6 @@ Returns 0 on success. 
""" - fm = ui.formatter('manifest', opts) if opts.get('all'): @@ -4664,6 +3473,7 @@ for fn, b, size in repo.store.datafiles(): if size != 0 and fn[-slen:] == suffix and fn[:plen] == prefix: res.append(fn[plen:-slen]) + ui.pager('manifest') for f in res: fm.startitem() fm.write("path", '%s\n', f) @@ -4680,6 +3490,7 @@ mode = {'l': '644', 'x': '755', '': '644'} ctx = scmutil.revsingle(repo, node) mf = ctx.manifest() + ui.pager('manifest') for f in ctx: fm.startitem() fl = ctx[f].flags() @@ -4812,6 +3623,7 @@ return revdag = cmdutil.graphrevs(repo, o, opts) + ui.pager('outgoing') displayer = cmdutil.show_changeset(ui, repo, opts, buffered=True) cmdutil.displaygraph(ui, repo, revdag, displayer, graphmod.asciiedges) cmdutil.outgoinghooks(ui, repo, other, opts, o) @@ -4825,6 +3637,7 @@ ui.warn(_("remote doesn't support bookmarks\n")) return 0 ui.status(_('comparing with %s\n') % util.hidepassword(dest)) + ui.pager('outgoing') return bookmarks.outgoing(ui, repo, other) repo._subtoppath = ui.expandpath(dest or 'default-push', dest or 'default') @@ -4921,6 +3734,7 @@ Returns 0 on success. """ + ui.pager('paths') if search: pathitems = [(name, path) for name, path in ui.paths.iteritems() if name == search] @@ -5268,7 +4082,7 @@ elif path.pushrev: # It doesn't make any sense to specify ancestor revisions. So limit # to DAG heads to make discovery simpler. - expr = revset.formatspec('heads(%r)', path.pushrev) + expr = revsetlang.formatspec('heads(%r)', path.pushrev) revs = scmutil.revrange(repo, [expr]) revs = [repo[rev].node() for rev in revs] if not revs: @@ -5434,6 +4248,8 @@ - :hg:`resolve -l`: list files which had or still have conflicts. In the printed list, ``U`` = unresolved and ``R`` = resolved. + You can use ``set:unresolved()`` or ``set:resolved()`` to filter + the list. See :hg:`help filesets` for details. .. 
note:: @@ -5457,6 +4273,7 @@ hint=('use --all to re-merge all unresolved files')) if show: + ui.pager('resolve') fm = ui.formatter('resolve', opts) ms = mergemod.mergestate.read(repo) m = scmutil.match(repo[None], pats, opts) @@ -5780,8 +4597,8 @@ ('', 'webdir-conf', '', _('name of the hgweb config file (DEPRECATED)'), _('FILE')), ('', 'pid-file', '', _('name of file to write process ID to'), _('FILE')), - ('', 'stdio', None, _('for remote clients')), - ('', 'cmdserver', '', _('for remote clients'), _('MODE')), + ('', 'stdio', None, _('for remote clients (ADVANCED)')), + ('', 'cmdserver', '', _('for remote clients (ADVANCED)'), _('MODE')), ('t', 'templates', '', _('web templates to use'), _('TEMPLATE')), ('', 'style', '', _('template style to use'), _('STYLE')), ('6', 'ipv6', None, _('use IPv6 in addition to IPv4')), @@ -5904,6 +4721,7 @@ Returns 0 on success. """ + opts = pycompat.byteskwargs(opts) revs = opts.get('rev') change = opts.get('change') @@ -5916,7 +4734,7 @@ else: node1, node2 = scmutil.revpair(repo, revs) - if pats: + if pats or ui.configbool('commands', 'status.relative'): cwd = repo.getcwd() else: cwd = '' @@ -5940,12 +4758,13 @@ stat = repo.status(node1, node2, m, 'ignored' in show, 'clean' in show, 'unknown' in show, opts.get('subrepos')) - changestates = zip(states, 'MAR!?IC', stat) + changestates = zip(states, pycompat.iterbytestr('MAR!?IC'), stat) if (opts.get('all') or opts.get('copies') or ui.configbool('ui', 'statuscopies')) and not opts.get('no_status'): copy = copies.pathcopies(repo[node1], repo[node2], m) + ui.pager('status') fm = ui.formatter('status', opts) fmt = '%s' + end showchar = not opts.get('no_status') @@ -5976,6 +4795,7 @@ Returns 0 on success. 
""" + ui.pager('summary') ctx = repo[None] parents = ctx.parents() pnode = parents[0].node() @@ -5996,7 +4816,7 @@ # label with log.changeset (instead of log.parent) since this # shows a working directory parent *changeset*: # i18n: column positioning for "hg summary" - ui.write(_('parent: %d:%s ') % (p.rev(), str(p)), + ui.write(_('parent: %d:%s ') % (p.rev(), p), label=cmdutil._changesetlabels(p)) ui.write(' '.join(p.tags()), label='log.tag') if p.bookmarks(): @@ -6368,6 +5188,7 @@ Returns 0 on success. """ + ui.pager('tags') fm = ui.formatter('tags', opts) hexfunc = fm.hexfunc tagtype = "" @@ -6464,12 +5285,13 @@ @command('^update|up|checkout|co', [('C', 'clean', None, _('discard uncommitted changes (no backup)')), ('c', 'check', None, _('require clean working directory')), + ('m', 'merge', None, _('merge uncommitted changes')), ('d', 'date', '', _('tipmost revision matching date'), _('DATE')), ('r', 'rev', '', _('revision'), _('REV')) ] + mergetoolopts, - _('[-c] [-C] [-d DATE] [[-r] REV]')) + _('[-C|-c|-m] [-d DATE] [[-r] REV]')) def update(ui, repo, node=None, rev=None, clean=False, date=None, check=False, - tool=None): + merge=None, tool=None): """update working directory (or switch revisions) Update the repository's working directory to the specified @@ -6488,10 +5310,11 @@ .. container:: verbose - The following rules apply when the working directory contains - uncommitted changes: - - 1. If neither -c/--check nor -C/--clean is specified, and if + The -C/--clean, -c/--check, and -m/--merge options control what + happens if the working directory contains uncommitted changes. + At most of one of them can be specified. + + 1. If no option is specified, and if the requested changeset is an ancestor or descendant of the working directory's parent, the uncommitted changes are merged into the requested changeset and the merged @@ -6500,10 +5323,14 @@ branch), the update is aborted and the uncommitted changes are preserved. - 2. 
With the -c/--check option, the update is aborted and the + 2. With the -m/--merge option, the update is allowed even if the + requested changeset is not an ancestor or descendant of + the working directory's parent. + + 3. With the -c/--check option, the update is aborted and the uncommitted changes are preserved. - 3. With the -C/--clean option, uncommitted changes are discarded and + 4. With the -C/--clean option, uncommitted changes are discarded and the working directory is updated to the requested changeset. To cancel an uncommitted merge (and lose your changes), use @@ -6522,14 +5349,26 @@ if rev and node: raise error.Abort(_("please specify just one revision")) + if ui.configbool('commands', 'update.requiredest'): + if not node and not rev and not date: + raise error.Abort(_('you must specify a destination'), + hint=_('for example: hg update ".::"')) + if rev is None or rev == '': rev = node if date and rev is not None: raise error.Abort(_("you can't specify a revision and a date")) - if check and clean: - raise error.Abort(_("cannot specify both -c/--check and -C/--clean")) + if len([x for x in (clean, check, merge) if x]) > 1: + raise error.Abort(_("can only specify one of -C/--clean, -c/--check, " + "or -m/merge")) + + updatecheck = None + if check: + updatecheck = 'abort' + elif merge: + updatecheck = 'none' with repo.wlock(): cmdutil.clearunfinished(repo) @@ -6541,12 +5380,10 @@ brev = rev rev = scmutil.revsingle(repo, rev, rev).rev() - if check: - cmdutil.bailifchanged(repo, merge=False) - repo.ui.setconfig('ui', 'forcemerge', tool, 'update') - return hg.updatetotally(ui, repo, rev, brev, clean=clean, check=check) + return hg.updatetotally(ui, repo, rev, brev, clean=clean, + updatecheck=updatecheck) @command('verify', []) def verify(ui, repo): @@ -6570,6 +5407,8 @@ @command('version', [] + formatteropts, norepo=True) def version_(ui, **opts): """output version and copyright information""" + if ui.verbose: + ui.pager('version') fm = 
ui.formatter("version", opts) fm.startitem() fm.write("ver", _("Mercurial Distributed SCM (version %s)\n"), diff -r ed5b25874d99 -r 4baf79a77afa mercurial/commandserver.py --- a/mercurial/commandserver.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/commandserver.py Fri Mar 24 08:37:26 2017 -0700 @@ -304,8 +304,8 @@ ui.flush() newfiles = [] nullfd = os.open(os.devnull, os.O_RDWR) - for f, sysf, mode in [(ui.fin, util.stdin, 'rb'), - (ui.fout, util.stdout, 'wb')]: + for f, sysf, mode in [(ui.fin, util.stdin, pycompat.sysstr('rb')), + (ui.fout, util.stdout, pycompat.sysstr('wb'))]: if f is sysf: newfd = os.dup(f.fileno()) os.dup2(nullfd, f.fileno()) @@ -447,6 +447,7 @@ self._sock = None self._oldsigchldhandler = None self._workerpids = set() # updated by signal handler; do not iterate + self._socketunlinked = None def init(self): self._sock = socket.socket(socket.AF_UNIX) @@ -455,11 +456,17 @@ o = signal.signal(signal.SIGCHLD, self._sigchldhandler) self._oldsigchldhandler = o self._servicehandler.printbanner(self.address) + self._socketunlinked = False + + def _unlinksocket(self): + if not self._socketunlinked: + self._servicehandler.unlinksocket(self.address) + self._socketunlinked = True def _cleanup(self): signal.signal(signal.SIGCHLD, self._oldsigchldhandler) self._sock.close() - self._servicehandler.unlinksocket(self.address) + self._unlinksocket() # don't kill child processes as they have active clients, just wait self._reapworkers(0) @@ -470,11 +477,23 @@ self._cleanup() def _mainloop(self): + exiting = False h = self._servicehandler - while not h.shouldexit(): + while True: + if not exiting and h.shouldexit(): + # clients can no longer connect() to the domain socket, so + # we stop queuing new requests. + # for requests that are queued (connect()-ed, but haven't been + # accept()-ed), handle them before exit. otherwise, clients + # waiting for recv() will receive ECONNRESET. 
+ self._unlinksocket() + exiting = True try: ready = select.select([self._sock], [], [], h.pollinterval)[0] if not ready: + # only exit if we completed all queued requests + if exiting: + break continue conn, _addr = self._sock.accept() except (select.error, socket.error) as inst: diff -r ed5b25874d99 -r 4baf79a77afa mercurial/config.py --- a/mercurial/config.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/config.py Fri Mar 24 08:37:26 2017 -0700 @@ -13,15 +13,16 @@ from .i18n import _ from . import ( error, + pycompat, util, ) class config(object): - def __init__(self, data=None, includepaths=[]): + def __init__(self, data=None, includepaths=None): self._data = {} self._source = {} self._unset = [] - self._includepaths = includepaths + self._includepaths = includepaths or [] if data: for k in data._data: self._data[k] = data[k].copy() @@ -69,6 +70,9 @@ def items(self, section): return self._data.get(section, {}).items() def set(self, section, item, value, source=""): + if pycompat.ispy3: + assert not isinstance(value, str), ( + 'config values may not be unicode strings on Python 3') if section not in self: self._data[section] = util.sortdict() self._data[section][item] = value @@ -169,5 +173,92 @@ def read(self, path, fp=None, sections=None, remap=None): if not fp: - fp = util.posixfile(path) - self.parse(path, fp.read(), sections, remap, self.read) + fp = util.posixfile(path, 'rb') + assert getattr(fp, 'mode', r'rb') == r'rb', ( + 'config files must be opened in binary mode, got fp=%r mode=%r' % ( + fp, fp.mode)) + self.parse(path, fp.read(), + sections=sections, remap=remap, include=self.read) + +def parselist(value): + """parse a configuration value as a list of comma/space separated strings + + >>> parselist('this,is "a small" ,test') + ['this', 'is', 'a small', 'test'] + """ + + def _parse_plain(parts, s, offset): + whitespace = False + while offset < len(s) and (s[offset:offset + 1].isspace() + or s[offset:offset + 1] == ','): + whitespace = True + offset 
+= 1 + if offset >= len(s): + return None, parts, offset + if whitespace: + parts.append('') + if s[offset:offset + 1] == '"' and not parts[-1]: + return _parse_quote, parts, offset + 1 + elif s[offset:offset + 1] == '"' and parts[-1][-1] == '\\': + parts[-1] = parts[-1][:-1] + s[offset:offset + 1] + return _parse_plain, parts, offset + 1 + parts[-1] += s[offset:offset + 1] + return _parse_plain, parts, offset + 1 + + def _parse_quote(parts, s, offset): + if offset < len(s) and s[offset:offset + 1] == '"': # "" + parts.append('') + offset += 1 + while offset < len(s) and (s[offset:offset + 1].isspace() or + s[offset:offset + 1] == ','): + offset += 1 + return _parse_plain, parts, offset + + while offset < len(s) and s[offset:offset + 1] != '"': + if (s[offset:offset + 1] == '\\' and offset + 1 < len(s) + and s[offset + 1:offset + 2] == '"'): + offset += 1 + parts[-1] += '"' + else: + parts[-1] += s[offset:offset + 1] + offset += 1 + + if offset >= len(s): + real_parts = _configlist(parts[-1]) + if not real_parts: + parts[-1] = '"' + else: + real_parts[0] = '"' + real_parts[0] + parts = parts[:-1] + parts.extend(real_parts) + return None, parts, offset + + offset += 1 + while offset < len(s) and s[offset:offset + 1] in [' ', ',']: + offset += 1 + + if offset < len(s): + if offset + 1 == len(s) and s[offset:offset + 1] == '"': + parts[-1] += '"' + offset += 1 + else: + parts.append('') + else: + return None, parts, offset + + return _parse_plain, parts, offset + + def _configlist(s): + s = s.rstrip(' ,') + if not s: + return [] + parser, parts, offset = _parse_plain, [''], 0 + while parser: + parser, parts, offset = parser(parts, s, offset) + return parts + + if value is not None and isinstance(value, bytes): + result = _configlist(value.lstrip(' ,\n')) + else: + result = value + return result or [] diff -r ed5b25874d99 -r 4baf79a77afa mercurial/context.py --- a/mercurial/context.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/context.py Fri Mar 24 08:37:26 2017 
-0700 @@ -18,11 +18,11 @@ bin, hex, modifiednodeid, - newnodeid, nullid, nullrev, short, wdirid, + wdirnodes, ) from . import ( encoding, @@ -33,6 +33,7 @@ obsolete as obsmod, patch, phases, + pycompat, repoview, revlog, scmutil, @@ -64,6 +65,12 @@ return o def __str__(self): + r = short(self.node()) + if pycompat.ispy3: + return r.decode('ascii') + return r + + def __bytes__(self): return short(self.node()) def __int__(self): @@ -90,14 +97,11 @@ def __iter__(self): return iter(self._manifest) - def _manifestmatches(self, match, s): - """generate a new manifest filtered by the match argument - - This method is for internal use only and mainly exists to provide an - object oriented way for other contexts to customize the manifest - generation. - """ - return self.manifest().matches(match) + def _buildstatusmanifest(self, status): + """Builds a manifest that includes the given status results, if this is + a working copy context. For non-working copy contexts, it just returns + the normal manifest.""" + return self.manifest() def _matchstatus(self, other, match): """return match.always if match is none @@ -116,17 +120,19 @@ # 1000 and cache it so that when you read 1001, we just need to apply a # delta to what's in the cache. So that's one full reconstruction + one # delta application. 
+ mf2 = None if self.rev() is not None and self.rev() < other.rev(): - self.manifest() - mf1 = other._manifestmatches(match, s) - mf2 = self._manifestmatches(match, s) + mf2 = self._buildstatusmanifest(s) + mf1 = other._buildstatusmanifest(s) + if mf2 is None: + mf2 = self._buildstatusmanifest(s) modified, added = [], [] removed = [] clean = [] deleted, unknown, ignored = s.deleted, s.unknown, s.ignored deletedset = set(deleted) - d = mf1.diff(mf2, clean=listclean) + d = mf1.diff(mf2, match=match, clean=listclean) for fn, value in d.iteritems(): if fn in deletedset: continue @@ -140,7 +146,7 @@ removed.append(fn) elif flag1 != flag2: modified.append(fn) - elif node2 != newnodeid: + elif node2 not in wdirnodes: # When comparing files between two commits, we save time by # not comparing the file contents when the nodeids differ. # Note that this means we incorrectly report a reverted change @@ -153,8 +159,10 @@ if removed: # need to filter files if they are already reported as removed - unknown = [fn for fn in unknown if fn not in mf1] - ignored = [fn for fn in ignored if fn not in mf1] + unknown = [fn for fn in unknown if fn not in mf1 and + (not match or match(fn))] + ignored = [fn for fn in ignored if fn not in mf1 and + (not match or match(fn))] # if they're deleted, don't report them as removed removed = [fn for fn in removed if fn not in deletedset] @@ -290,8 +298,10 @@ ''' return subrepo.subrepo(self, path, allowwdir=True) - def match(self, pats=[], include=None, exclude=None, default='glob', + def match(self, pats=None, include=None, exclude=None, default='glob', listsubrepos=False, badfn=None): + if pats is None: + pats = [] r = self._repo return matchmod.match(r.root, r.getcwd(), pats, include, exclude, default, @@ -418,7 +428,7 @@ self._node = repo.changelog.node(changeid) self._rev = changeid return - if isinstance(changeid, long): + if not pycompat.ispy3 and isinstance(changeid, long): changeid = str(changeid) if changeid == 'null': self._node = nullid 
@@ -446,7 +456,7 @@ try: r = int(changeid) - if str(r) != changeid: + if '%d' % r != changeid: raise ValueError l = len(repo.changelog) if r < 0: @@ -524,6 +534,8 @@ def __nonzero__(self): return self._rev != nullrev + __bool__ = __nonzero__ + @propertycache def _changeset(self): return self._repo.changelog.changelogrevision(self.rev()) @@ -712,6 +724,8 @@ # file is missing return False + __bool__ = __nonzero__ + def __str__(self): try: return "%s@%s" % (self.path(), self._changectx) @@ -1166,7 +1180,7 @@ diffinrange = any(stype == '!' for _, stype in filteredblocks) return diffinrange, linerange1 -def blockancestors(fctx, fromline, toline): +def blockancestors(fctx, fromline, toline, followfirst=False): """Yield ancestors of `fctx` with respect to the block of lines within `fromline`-`toline` range. """ @@ -1175,9 +1189,11 @@ while visit: c, linerange2 = visit.pop(max(visit)) pl = c.parents() + if followfirst: + pl = pl[:1] if not pl: # The block originates from the initial revision. - yield c + yield c, linerange2 continue inrange = False for p in pl: @@ -1190,7 +1206,7 @@ continue visit[p.linkrev(), p.filenode()] = p, linerange1 if inrange: - yield c + yield c, linerange2 class committablectx(basectx): """A committablectx object provides common functionality for a context that @@ -1226,6 +1242,8 @@ def __nonzero__(self): return True + __bool__ = __nonzero__ + def _buildflagfunc(self): # Create a fallback function for getting file flags when the # filesystem doesn't support them @@ -1263,35 +1281,6 @@ return self._repo.dirstate.flagfunc(self._buildflagfunc) @propertycache - def _manifest(self): - """generate a manifest corresponding to the values in self._status - - This reuse the file nodeid from parent, but we append an extra letter - when modified. Modified files get an extra 'm' while added files get - an extra 'a'. This is used by manifests merge to see that files - are different and by update logic to avoid deleting newly added files. 
- """ - parents = self.parents() - - man = parents[0].manifest().copy() - - ff = self._flagfunc - for i, l in ((addednodeid, self._status.added), - (modifiednodeid, self._status.modified)): - for f in l: - man[f] = i - try: - man.setflag(f, ff(f)) - except OSError: - pass - - for f in self._status.deleted + self._status.removed: - if f in man: - del man[f] - - return man - - @propertycache def _status(self): return self._repo.status() @@ -1534,21 +1523,21 @@ self._repo.dirstate.normallookup(dest) self._repo.dirstate.copy(source, dest) - def match(self, pats=[], include=None, exclude=None, default='glob', + def match(self, pats=None, include=None, exclude=None, default='glob', listsubrepos=False, badfn=None): + if pats is None: + pats = [] r = self._repo # Only a case insensitive filesystem needs magic to translate user input # to actual case in the filesystem. + matcherfunc = matchmod.match if not util.fscasesensitive(r.root): - return matchmod.icasefsmatcher(r.root, r.getcwd(), pats, include, - exclude, default, r.auditor, self, - listsubrepos=listsubrepos, - badfn=badfn) - return matchmod.match(r.root, r.getcwd(), pats, - include, exclude, default, - auditor=r.auditor, ctx=self, - listsubrepos=listsubrepos, badfn=badfn) + matcherfunc = matchmod.icasefsmatcher + return matcherfunc(r.root, r.getcwd(), pats, + include, exclude, default, + auditor=r.auditor, ctx=self, + listsubrepos=listsubrepos, badfn=badfn) def _filtersuspectsymlink(self, files): if not files or self._repo.dirstate._checklink: @@ -1605,22 +1594,6 @@ pass return modified, fixup - def _manifestmatches(self, match, s): - """Slow path for workingctx - - The fast path is when we compare the working directory to its parent - which means this function is comparing with a non-parent; therefore we - need to build a manifest and return what matches. 
- """ - mf = self._repo['.']._manifestmatches(match, s) - for f in s.modified + s.added: - mf[f] = newnodeid - mf.setflag(f, self.flags(f)) - for f in s.removed: - if f in mf: - del mf[f] - return mf - def _dirstatestatus(self, match=None, ignored=False, clean=False, unknown=False): '''Gets the status from the dirstate -- internal use only.''' @@ -1652,6 +1625,39 @@ return s + @propertycache + def _manifest(self): + """generate a manifest corresponding to the values in self._status + + This reuse the file nodeid from parent, but we use special node + identifiers for added and modified files. This is used by manifests + merge to see that files are different and by update logic to avoid + deleting newly added files. + """ + return self._buildstatusmanifest(self._status) + + def _buildstatusmanifest(self, status): + """Builds a manifest that includes the given status results.""" + parents = self.parents() + + man = parents[0].manifest().copy() + + ff = self._flagfunc + for i, l in ((addednodeid, status.added), + (modifiednodeid, status.modified)): + for f in l: + man[f] = i + try: + man.setflag(f, ff(f)) + except OSError: + pass + + for f in status.deleted + status.removed: + if f in man: + del man[f] + + return man + def _buildstatus(self, other, s, match, listignored, listclean, listunknown): """build a status with respect to another context @@ -1711,6 +1717,8 @@ def __nonzero__(self): return True + __bool__ = __nonzero__ + def linkrev(self): # linked to self._changectx no matter if file is modified or not return self.rev() @@ -1779,7 +1787,7 @@ def remove(self, ignoremissing=False): """wraps unlink for a repo's working directory""" - util.unlinkpath(self._repo.wjoin(self._path), ignoremissing) + self._repo.wvfs.unlinkpath(self._path, ignoremissing=ignoremissing) def write(self, data, flags): """wraps repo.wwrite""" diff -r ed5b25874d99 -r 4baf79a77afa mercurial/copies.py --- a/mercurial/copies.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/copies.py Fri Mar 24 
08:37:26 2017 -0700 @@ -149,10 +149,7 @@ """ ma = a.manifest() mb = b.manifest() - if match: - ma = ma.matches(match) - mb = mb.matches(match) - return mb.filesnotin(ma) + return mb.filesnotin(ma, match=match) def _forwardcopies(a, b, match=None): '''find {dst@b: src@a} copy mapping where a is an ancestor of b''' diff -r ed5b25874d99 -r 4baf79a77afa mercurial/crecord.py --- a/mercurial/crecord.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/crecord.py Fri Mar 24 08:37:26 2017 -0700 @@ -1375,7 +1375,8 @@ pass helpwin.refresh() try: - helpwin.getkey() + with self.ui.timeblockedsection('crecord'): + helpwin.getkey() except curses.error: pass @@ -1392,7 +1393,8 @@ self.stdscr.refresh() confirmwin.refresh() try: - response = chr(self.stdscr.getch()) + with self.ui.timeblockedsection('crecord'): + response = chr(self.stdscr.getch()) except ValueError: response = None @@ -1412,7 +1414,8 @@ are you sure you want to review/edit and confirm the selected changes [yn]? """) - response = self.confirmationwindow(confirmtext) + with self.ui.timeblockedsection('crecord'): + response = self.confirmationwindow(confirmtext) if response is None: response = "n" if response.lower().startswith("y"): @@ -1655,7 +1658,8 @@ while True: self.updatescreen() try: - keypressed = self.statuswin.getkey() + with self.ui.timeblockedsection('crecord'): + keypressed = self.statuswin.getkey() if self.errorstr is not None: self.errorstr = None continue diff -r ed5b25874d99 -r 4baf79a77afa mercurial/debugcommands.py --- a/mercurial/debugcommands.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/debugcommands.py Fri Mar 24 08:37:26 2017 -0700 @@ -7,41 +7,63 @@ from __future__ import absolute_import +import difflib +import errno import operator import os import random +import socket +import string +import sys +import tempfile +import time from .i18n import _ from .node import ( bin, hex, + nullhex, nullid, + nullrev, short, ) from . 
import ( bundle2, changegroup, cmdutil, + color, commands, context, dagparser, dagutil, + encoding, error, exchange, extensions, fileset, + formatter, hg, localrepo, lock as lockmod, + merge as mergemod, + obsolete, + policy, + pvec, pycompat, repair, revlog, + revset, + revsetlang, scmutil, setdiscovery, simplemerge, + smartset, + sslutil, streamclone, + templater, treediscovery, util, + vfs as vfsmod, ) release = lockmod.release @@ -55,7 +77,7 @@ """find the ancestor revision of two revisions in a given index""" if len(args) == 3: index, rev1, rev2 = args - r = revlog.revlog(scmutil.opener(pycompat.getcwd(), audit=False), index) + r = revlog.revlog(vfsmod.vfs(pycompat.getcwd(), audit=False), index) lookup = r.lookup elif len(args) == 2: if not repo: @@ -324,6 +346,47 @@ error = _(".hg/dirstate inconsistent with current parent's manifest") raise error.Abort(error) +@command('debugcolor', + [('', 'style', None, _('show all configured styles'))], + 'hg debugcolor') +def debugcolor(ui, repo, **opts): + """show available color, effects or style""" + ui.write(('color mode: %s\n') % ui._colormode) + if opts.get('style'): + return _debugdisplaystyle(ui) + else: + return _debugdisplaycolor(ui) + +def _debugdisplaycolor(ui): + ui = ui.copy() + ui._styles.clear() + for effect in color._effects.keys(): + ui._styles[effect] = effect + if ui._terminfoparams: + for k, v in ui.configitems('color'): + if k.startswith('color.'): + ui._styles[k] = k[6:] + elif k.startswith('terminfo.'): + ui._styles[k] = k[9:] + ui.write(_('available colors:\n')) + # sort label with a '_' after the other to group '_background' entry. 
+ items = sorted(ui._styles.items(), + key=lambda i: ('_' in i[0], i[0], i[1])) + for colorname, label in items: + ui.write(('%s\n') % colorname, label=label) + +def _debugdisplaystyle(ui): + ui.write(_('available style:\n')) + width = max(len(s) for s in ui._styles) + for label, effects in sorted(ui._styles.items()): + ui.write('%s' % label, label=label) + if effects: + # 50 + ui.write(': ') + ui.write(' ' * (max(0, width - len(label)))) + ui.write(', '.join(ui.label(e, e) for e in effects.split())) + ui.write('\n') + @command('debugcommands', [], _('[COMMAND]'), norepo=True) def debugcommands(ui, cmd='', *args): """list all available commands and options""" @@ -390,7 +453,7 @@ spaces = opts.get('spaces') dots = opts.get('dots') if file_: - rlog = revlog.revlog(scmutil.opener(pycompat.getcwd(), audit=False), + rlog = revlog.revlog(vfsmod.vfs(pycompat.getcwd(), audit=False), file_) revs = set((int(r) for r in revs)) def events(): @@ -567,6 +630,37 @@ fm.end() +@command('debugdirstate|debugstate', + [('', 'nodates', None, _('do not display the saved mtime')), + ('', 'datesort', None, _('sort by saved mtime'))], + _('[OPTION]...')) +def debugstate(ui, repo, **opts): + """show the contents of the current dirstate""" + + nodates = opts.get('nodates') + datesort = opts.get('datesort') + + timestr = "" + if datesort: + keyfunc = lambda x: (x[1][3], x[0]) # sort by mtime, then by filename + else: + keyfunc = None # sort by filename + for file_, ent in sorted(repo.dirstate._map.iteritems(), key=keyfunc): + if ent[3] == -1: + timestr = 'unset ' + elif nodates: + timestr = 'set ' + else: + timestr = time.strftime("%Y-%m-%d %H:%M:%S ", + time.localtime(ent[3])) + if ent[1] & 0o20000: + mode = 'lnk' + else: + mode = '%3o' % (ent[1] & 0o777 & ~util.umask) + ui.write("%c %s %10d %s%s\n" % (ent[0], mode, ent[2], timestr, file_)) + for f in repo.dirstate.copies(): + ui.write(_("copy: %s -> %s\n") % (repo.dirstate.copied(f), f)) + @command('debugdiscovery', [('', 'old', None, 
_('use old-style discovery')), ('', 'nonheads', None, @@ -641,7 +735,7 @@ fm = ui.formatter('debugextensions', opts) for extname, extmod in sorted(exts, key=operator.itemgetter(0)): isinternal = extensions.ismoduleinternal(extmod) - extsource = extmod.__file__ + extsource = pycompat.fsencode(extmod.__file__) if isinternal: exttestedwith = [] # never expose magic string to users else: @@ -696,11 +790,12 @@ """show information detected about current filesystem""" util.writefile('.debugfsinfo', '') ui.write(('exec: %s\n') % (util.checkexec(path) and 'yes' or 'no')) + ui.write(('fstype: %s\n') % (util.getfstype('.') or '(unknown)')) ui.write(('symlink: %s\n') % (util.checklink(path) and 'yes' or 'no')) ui.write(('hardlink: %s\n') % (util.checknlink(path) and 'yes' or 'no')) ui.write(('case-sensitive: %s\n') % (util.fscasesensitive('.debugfsinfo') and 'yes' or 'no')) - os.unlink('.debugfsinfo') + util.tryunlink('.debugfsinfo') @command('debuggetbundle', [('H', 'head', [], _('id of head node'), _('ID')), @@ -851,6 +946,1106 @@ ui.write("\t%d -> %d\n" % (r.rev(pp[1]), i)) ui.write("}\n") +@command('debuginstall', [] + commands.formatteropts, '', norepo=True) +def debuginstall(ui, **opts): + '''test Mercurial installation + + Returns 0 on success. 
+ ''' + + def writetemp(contents): + (fd, name) = tempfile.mkstemp(prefix="hg-debuginstall-") + f = os.fdopen(fd, pycompat.sysstr("wb")) + f.write(contents) + f.close() + return name + + problems = 0 + + fm = ui.formatter('debuginstall', opts) + fm.startitem() + + # encoding + fm.write('encoding', _("checking encoding (%s)...\n"), encoding.encoding) + err = None + try: + encoding.fromlocal("test") + except error.Abort as inst: + err = inst + problems += 1 + fm.condwrite(err, 'encodingerror', _(" %s\n" + " (check that your locale is properly set)\n"), err) + + # Python + fm.write('pythonexe', _("checking Python executable (%s)\n"), + pycompat.sysexecutable) + fm.write('pythonver', _("checking Python version (%s)\n"), + ("%d.%d.%d" % sys.version_info[:3])) + fm.write('pythonlib', _("checking Python lib (%s)...\n"), + os.path.dirname(pycompat.fsencode(os.__file__))) + + security = set(sslutil.supportedprotocols) + if sslutil.hassni: + security.add('sni') + + fm.write('pythonsecurity', _("checking Python security support (%s)\n"), + fm.formatlist(sorted(security), name='protocol', + fmt='%s', sep=',')) + + # These are warnings, not errors. So don't increment problem count. This + # may change in the future. 
+ if 'tls1.2' not in security: + fm.plain(_(' TLS 1.2 not supported by Python install; ' + 'network connections lack modern security\n')) + if 'sni' not in security: + fm.plain(_(' SNI not supported by Python install; may have ' + 'connectivity issues with some servers\n')) + + # TODO print CA cert info + + # hg version + hgver = util.version() + fm.write('hgver', _("checking Mercurial version (%s)\n"), + hgver.split('+')[0]) + fm.write('hgverextra', _("checking Mercurial custom build (%s)\n"), + '+'.join(hgver.split('+')[1:])) + + # compiled modules + fm.write('hgmodulepolicy', _("checking module policy (%s)\n"), + policy.policy) + fm.write('hgmodules', _("checking installed modules (%s)...\n"), + os.path.dirname(pycompat.fsencode(__file__))) + + err = None + try: + from . import ( + base85, + bdiff, + mpatch, + osutil, + ) + dir(bdiff), dir(mpatch), dir(base85), dir(osutil) # quiet pyflakes + except Exception as inst: + err = inst + problems += 1 + fm.condwrite(err, 'extensionserror', " %s\n", err) + + compengines = util.compengines._engines.values() + fm.write('compengines', _('checking registered compression engines (%s)\n'), + fm.formatlist(sorted(e.name() for e in compengines), + name='compengine', fmt='%s', sep=', ')) + fm.write('compenginesavail', _('checking available compression engines ' + '(%s)\n'), + fm.formatlist(sorted(e.name() for e in compengines + if e.available()), + name='compengine', fmt='%s', sep=', ')) + wirecompengines = util.compengines.supportedwireengines(util.SERVERROLE) + fm.write('compenginesserver', _('checking available compression engines ' + 'for wire protocol (%s)\n'), + fm.formatlist([e.name() for e in wirecompengines + if e.wireprotosupport()], + name='compengine', fmt='%s', sep=', ')) + + # templates + p = templater.templatepaths() + fm.write('templatedirs', 'checking templates (%s)...\n', ' '.join(p)) + fm.condwrite(not p, '', _(" no template directories found\n")) + if p: + m = templater.templatepath("map-cmdline.default") + 
if m: + # template found, check if it is working + err = None + try: + templater.templater.frommapfile(m) + except Exception as inst: + err = inst + p = None + fm.condwrite(err, 'defaulttemplateerror', " %s\n", err) + else: + p = None + fm.condwrite(p, 'defaulttemplate', + _("checking default template (%s)\n"), m) + fm.condwrite(not m, 'defaulttemplatenotfound', + _(" template '%s' not found\n"), "default") + if not p: + problems += 1 + fm.condwrite(not p, '', + _(" (templates seem to have been installed incorrectly)\n")) + + # editor + editor = ui.geteditor() + editor = util.expandpath(editor) + fm.write('editor', _("checking commit editor... (%s)\n"), editor) + cmdpath = util.findexe(pycompat.shlexsplit(editor)[0]) + fm.condwrite(not cmdpath and editor == 'vi', 'vinotfound', + _(" No commit editor set and can't find %s in PATH\n" + " (specify a commit editor in your configuration" + " file)\n"), not cmdpath and editor == 'vi' and editor) + fm.condwrite(not cmdpath and editor != 'vi', 'editornotfound', + _(" Can't find editor '%s' in PATH\n" + " (specify a commit editor in your configuration" + " file)\n"), not cmdpath and editor) + if not cmdpath and editor != 'vi': + problems += 1 + + # check username + username = None + err = None + try: + username = ui.username() + except error.Abort as e: + err = e + problems += 1 + + fm.condwrite(username, 'username', _("checking username (%s)\n"), username) + fm.condwrite(err, 'usernameerror', _("checking username...\n %s\n" + " (specify a username in your configuration file)\n"), err) + + fm.condwrite(not problems, '', + _("no problems detected\n")) + if not problems: + fm.data(problems=problems) + fm.condwrite(problems, 'problems', + _("%d problems detected," + " please check your install!\n"), problems) + fm.end() + + return problems + +@command('debugknown', [], _('REPO ID...'), norepo=True) +def debugknown(ui, repopath, *ids, **opts): + """test whether node ids are known to a repo + + Every ID must be a full-length hex 
node id string. Returns a list of 0s + and 1s indicating unknown/known. + """ + repo = hg.peer(ui, opts, repopath) + if not repo.capable('known'): + raise error.Abort("known() not supported by target repository") + flags = repo.known([bin(s) for s in ids]) + ui.write("%s\n" % ("".join([f and "1" or "0" for f in flags]))) + +@command('debuglabelcomplete', [], _('LABEL...')) +def debuglabelcomplete(ui, repo, *args): + '''backwards compatibility with old bash completion scripts (DEPRECATED)''' + debugnamecomplete(ui, repo, *args) + +@command('debuglocks', + [('L', 'force-lock', None, _('free the store lock (DANGEROUS)')), + ('W', 'force-wlock', None, + _('free the working state lock (DANGEROUS)'))], + _('[OPTION]...')) +def debuglocks(ui, repo, **opts): + """show or modify state of locks + + By default, this command will show which locks are held. This + includes the user and process holding the lock, the amount of time + the lock has been held, and the machine name where the process is + running if it's not local. + + Locks protect the integrity of Mercurial's data, so should be + treated with care. System crashes or other interruptions may cause + locks to not be properly released, though Mercurial will usually + detect and remove such stale locks automatically. + + However, detecting stale locks may not always be possible (for + instance, on a shared filesystem). Removing locks may also be + blocked by filesystem permissions. + + Returns 0 if no locks are held. 
+ + """ + + if opts.get('force_lock'): + repo.svfs.unlink('lock') + if opts.get('force_wlock'): + repo.vfs.unlink('wlock') + if opts.get('force_lock') or opts.get('force_wlock'): + return 0 + + now = time.time() + held = 0 + + def report(vfs, name, method): + # this causes stale locks to get reaped for more accurate reporting + try: + l = method(False) + except error.LockHeld: + l = None + + if l: + l.release() + else: + try: + stat = vfs.lstat(name) + age = now - stat.st_mtime + user = util.username(stat.st_uid) + locker = vfs.readlock(name) + if ":" in locker: + host, pid = locker.split(':') + if host == socket.gethostname(): + locker = 'user %s, process %s' % (user, pid) + else: + locker = 'user %s, process %s, host %s' \ + % (user, pid, host) + ui.write(("%-6s %s (%ds)\n") % (name + ":", locker, age)) + return 1 + except OSError as e: + if e.errno != errno.ENOENT: + raise + + ui.write(("%-6s free\n") % (name + ":")) + return 0 + + held += report(repo.svfs, "lock", repo.lock) + held += report(repo.vfs, "wlock", repo.wlock) + + return held + +@command('debugmergestate', [], '') +def debugmergestate(ui, repo, *args): + """print merge state + + Use --verbose to print out information about whether v1 or v2 merge state + was chosen.""" + def _hashornull(h): + if h == nullhex: + return 'null' + else: + return h + + def printrecords(version): + ui.write(('* version %s records\n') % version) + if version == 1: + records = v1records + else: + records = v2records + + for rtype, record in records: + # pretty print some record types + if rtype == 'L': + ui.write(('local: %s\n') % record) + elif rtype == 'O': + ui.write(('other: %s\n') % record) + elif rtype == 'm': + driver, mdstate = record.split('\0', 1) + ui.write(('merge driver: %s (state "%s")\n') + % (driver, mdstate)) + elif rtype in 'FDC': + r = record.split('\0') + f, state, hash, lfile, afile, anode, ofile = r[0:7] + if version == 1: + onode = 'not stored in v1 format' + flags = r[7] + else: + onode, flags = r[7:9] 
+ ui.write(('file: %s (record type "%s", state "%s", hash %s)\n') + % (f, rtype, state, _hashornull(hash))) + ui.write((' local path: %s (flags "%s")\n') % (lfile, flags)) + ui.write((' ancestor path: %s (node %s)\n') + % (afile, _hashornull(anode))) + ui.write((' other path: %s (node %s)\n') + % (ofile, _hashornull(onode))) + elif rtype == 'f': + filename, rawextras = record.split('\0', 1) + extras = rawextras.split('\0') + i = 0 + extrastrings = [] + while i < len(extras): + extrastrings.append('%s = %s' % (extras[i], extras[i + 1])) + i += 2 + + ui.write(('file extras: %s (%s)\n') + % (filename, ', '.join(extrastrings))) + elif rtype == 'l': + labels = record.split('\0', 2) + labels = [l for l in labels if len(l) > 0] + ui.write(('labels:\n')) + ui.write((' local: %s\n' % labels[0])) + ui.write((' other: %s\n' % labels[1])) + if len(labels) > 2: + ui.write((' base: %s\n' % labels[2])) + else: + ui.write(('unrecognized entry: %s\t%s\n') + % (rtype, record.replace('\0', '\t'))) + + # Avoid mergestate.read() since it may raise an exception for unsupported + # merge state records. We shouldn't be doing this, but this is OK since this + # command is pretty low-level. 
+ ms = mergemod.mergestate(repo) + + # sort so that reasonable information is on top + v1records = ms._readrecordsv1() + v2records = ms._readrecordsv2() + order = 'LOml' + def key(r): + idx = order.find(r[0]) + if idx == -1: + return (1, r[1]) + else: + return (0, idx) + v1records.sort(key=key) + v2records.sort(key=key) + + if not v1records and not v2records: + ui.write(('no merge state found\n')) + elif not v2records: + ui.note(('no version 2 merge state\n')) + printrecords(1) + elif ms._v1v2match(v1records, v2records): + ui.note(('v1 and v2 states match: using v2\n')) + printrecords(2) + else: + ui.note(('v1 and v2 states mismatch: using v1\n')) + printrecords(1) + if ui.verbose: + printrecords(2) + +@command('debugnamecomplete', [], _('NAME...')) +def debugnamecomplete(ui, repo, *args): + '''complete "names" - tags, open branch names, bookmark names''' + + names = set() + # since we previously only listed open branches, we will handle that + # specially (after this for loop) + for name, ns in repo.names.iteritems(): + if name != 'branches': + names.update(ns.listnames(repo)) + names.update(tag for (tag, heads, tip, closed) + in repo.branchmap().iterbranches() if not closed) + completions = set() + if not args: + args = [''] + for a in args: + completions.update(n for n in names if n.startswith(a)) + ui.write('\n'.join(sorted(completions))) + ui.write('\n') + +@command('debugobsolete', + [('', 'flags', 0, _('markers flag')), + ('', 'record-parents', False, + _('record parent information for the precursor')), + ('r', 'rev', [], _('display markers relevant to REV')), + ('', 'index', False, _('display index of the marker')), + ('', 'delete', [], _('delete markers specified by indices')), + ] + commands.commitopts2 + commands.formatteropts, + _('[OBSOLETED [REPLACEMENT ...]]')) +def debugobsolete(ui, repo, precursor=None, *successors, **opts): + """create arbitrary obsolete marker + + With no arguments, displays the list of obsolescence markers.""" + + def 
parsenodeid(s): + try: + # We do not use revsingle/revrange functions here to accept + # arbitrary node identifiers, possibly not present in the + # local repository. + n = bin(s) + if len(n) != len(nullid): + raise TypeError() + return n + except TypeError: + raise error.Abort('changeset references must be full hexadecimal ' + 'node identifiers') + + if opts.get('delete'): + indices = [] + for v in opts.get('delete'): + try: + indices.append(int(v)) + except ValueError: + raise error.Abort(_('invalid index value: %r') % v, + hint=_('use integers for indices')) + + if repo.currenttransaction(): + raise error.Abort(_('cannot delete obsmarkers in the middle ' + 'of transaction.')) + + with repo.lock(): + n = repair.deleteobsmarkers(repo.obsstore, indices) + ui.write(_('deleted %i obsolescence markers\n') % n) + + return + + if precursor is not None: + if opts['rev']: + raise error.Abort('cannot select revision when creating marker') + metadata = {} + metadata['user'] = opts['user'] or ui.username() + succs = tuple(parsenodeid(succ) for succ in successors) + l = repo.lock() + try: + tr = repo.transaction('debugobsolete') + try: + date = opts.get('date') + if date: + date = util.parsedate(date) + else: + date = None + prec = parsenodeid(precursor) + parents = None + if opts['record_parents']: + if prec not in repo.unfiltered(): + raise error.Abort('cannot use --record-parents on ' + 'unknown changesets') + parents = repo.unfiltered()[prec].parents() + parents = tuple(p.node() for p in parents) + repo.obsstore.create(tr, prec, succs, opts['flags'], + parents=parents, date=date, + metadata=metadata) + tr.close() + except ValueError as exc: + raise error.Abort(_('bad obsmarker input: %s') % exc) + finally: + tr.release() + finally: + l.release() + else: + if opts['rev']: + revs = scmutil.revrange(repo, opts['rev']) + nodes = [repo[r].node() for r in revs] + markers = list(obsolete.getmarkers(repo, nodes=nodes)) + markers.sort(key=lambda x: x._data) + else: + markers = 
obsolete.getmarkers(repo) + + markerstoiter = markers + isrelevant = lambda m: True + if opts.get('rev') and opts.get('index'): + markerstoiter = obsolete.getmarkers(repo) + markerset = set(markers) + isrelevant = lambda m: m in markerset + + fm = ui.formatter('debugobsolete', opts) + for i, m in enumerate(markerstoiter): + if not isrelevant(m): + # marker can be irrelevant when we're iterating over a set + # of markers (markerstoiter) which is bigger than the set + # of markers we want to display (markers) + # this can happen if both --index and --rev options are + # provided and thus we need to iterate over all of the markers + # to get the correct indices, but only display the ones that + # are relevant to --rev value + continue + fm.startitem() + ind = i if opts.get('index') else None + cmdutil.showmarker(fm, m, index=ind) + fm.end() + +@command('debugpathcomplete', + [('f', 'full', None, _('complete an entire path')), + ('n', 'normal', None, _('show only normal files')), + ('a', 'added', None, _('show only added files')), + ('r', 'removed', None, _('show only removed files'))], + _('FILESPEC...')) +def debugpathcomplete(ui, repo, *specs, **opts): + '''complete part or all of a tracked path + + This command supports shells that offer path name completion. It + currently completes only files already known to the dirstate. 
+ + Completion extends only to the next path segment unless + --full is specified, in which case entire paths are used.''' + + def complete(path, acceptable): + dirstate = repo.dirstate + spec = os.path.normpath(os.path.join(pycompat.getcwd(), path)) + rootdir = repo.root + pycompat.ossep + if spec != repo.root and not spec.startswith(rootdir): + return [], [] + if os.path.isdir(spec): + spec += '/' + spec = spec[len(rootdir):] + fixpaths = pycompat.ossep != '/' + if fixpaths: + spec = spec.replace(pycompat.ossep, '/') + speclen = len(spec) + fullpaths = opts['full'] + files, dirs = set(), set() + adddir, addfile = dirs.add, files.add + for f, st in dirstate.iteritems(): + if f.startswith(spec) and st[0] in acceptable: + if fixpaths: + f = f.replace('/', pycompat.ossep) + if fullpaths: + addfile(f) + continue + s = f.find(pycompat.ossep, speclen) + if s >= 0: + adddir(f[:s]) + else: + addfile(f) + return files, dirs + + acceptable = '' + if opts['normal']: + acceptable += 'nm' + if opts['added']: + acceptable += 'a' + if opts['removed']: + acceptable += 'r' + cwd = repo.getcwd() + if not specs: + specs = ['.'] + + files, dirs = set(), set() + for spec in specs: + f, d = complete(spec, acceptable or 'nmar') + files.update(f) + dirs.update(d) + files.update(dirs) + ui.write('\n'.join(repo.pathto(p, cwd) for p in sorted(files))) + ui.write('\n') + +@command('debugpushkey', [], _('REPO NAMESPACE [KEY OLD NEW]'), norepo=True) +def debugpushkey(ui, repopath, namespace, *keyinfo, **opts): + '''access the pushkey key/value protocol + + With two args, list the keys in the given namespace. + + With five args, set a key to new if it currently is set to old. + Reports success or failure. 
+ ''' + + target = hg.peer(ui, {}, repopath) + if keyinfo: + key, old, new = keyinfo + r = target.pushkey(namespace, key, old, new) + ui.status(str(r) + '\n') + return not r + else: + for k, v in sorted(target.listkeys(namespace).iteritems()): + ui.write("%s\t%s\n" % (util.escapestr(k), + util.escapestr(v))) + +@command('debugpvec', [], _('A B')) +def debugpvec(ui, repo, a, b=None): + ca = scmutil.revsingle(repo, a) + cb = scmutil.revsingle(repo, b) + pa = pvec.ctxpvec(ca) + pb = pvec.ctxpvec(cb) + if pa == pb: + rel = "=" + elif pa > pb: + rel = ">" + elif pa < pb: + rel = "<" + elif pa | pb: + rel = "|" + ui.write(_("a: %s\n") % pa) + ui.write(_("b: %s\n") % pb) + ui.write(_("depth(a): %d depth(b): %d\n") % (pa._depth, pb._depth)) + ui.write(_("delta: %d hdist: %d distance: %d relation: %s\n") % + (abs(pa._depth - pb._depth), pvec._hamming(pa._vec, pb._vec), + pa.distance(pb), rel)) + +@command('debugrebuilddirstate|debugrebuildstate', + [('r', 'rev', '', _('revision to rebuild to'), _('REV')), + ('', 'minimal', None, _('only rebuild files that are inconsistent with ' + 'the working copy parent')), + ], + _('[-r REV]')) +def debugrebuilddirstate(ui, repo, rev, **opts): + """rebuild the dirstate as it would look like for the given revision + + If no revision is specified the first current parent will be used. + + The dirstate will be set to the files of the given revision. + The actual working directory content or existing dirstate + information such as adds or removes is not considered. + + ``minimal`` will only rebuild the dirstate status for files that claim to be + tracked but are not in the parent manifest, or that exist in the parent + manifest but are not in the dirstate. It will not change adds, removes, or + modified files that are in the working copy parent. + + One use of this command is to make the next :hg:`status` invocation + check the actual file content. 
+ """ + ctx = scmutil.revsingle(repo, rev) + with repo.wlock(): + dirstate = repo.dirstate + changedfiles = None + # See command doc for what minimal does. + if opts.get('minimal'): + manifestfiles = set(ctx.manifest().keys()) + dirstatefiles = set(dirstate) + manifestonly = manifestfiles - dirstatefiles + dsonly = dirstatefiles - manifestfiles + dsnotadded = set(f for f in dsonly if dirstate[f] != 'a') + changedfiles = manifestonly | dsnotadded + + dirstate.rebuild(ctx.node(), ctx.manifest(), changedfiles) + +@command('debugrebuildfncache', [], '') +def debugrebuildfncache(ui, repo): + """rebuild the fncache file""" + repair.rebuildfncache(ui, repo) + +@command('debugrename', + [('r', 'rev', '', _('revision to debug'), _('REV'))], + _('[-r REV] FILE')) +def debugrename(ui, repo, file1, *pats, **opts): + """dump rename information""" + + ctx = scmutil.revsingle(repo, opts.get('rev')) + m = scmutil.match(ctx, (file1,) + pats, opts) + for abs in ctx.walk(m): + fctx = ctx[abs] + o = fctx.filelog().renamed(fctx.filenode()) + rel = m.rel(abs) + if o: + ui.write(_("%s renamed from %s:%s\n") % (rel, o[0], hex(o[1]))) + else: + ui.write(_("%s not renamed\n") % rel) + +@command('debugrevlog', commands.debugrevlogopts + + [('d', 'dump', False, _('dump index data'))], + _('-c|-m|FILE'), + optionalrepo=True) +def debugrevlog(ui, repo, file_=None, **opts): + """show data and statistics about a revlog""" + r = cmdutil.openrevlog(repo, 'debugrevlog', file_, opts) + + if opts.get("dump"): + numrevs = len(r) + ui.write(("# rev p1rev p2rev start end deltastart base p1 p2" + " rawsize totalsize compression heads chainlen\n")) + ts = 0 + heads = set() + + for rev in xrange(numrevs): + dbase = r.deltaparent(rev) + if dbase == -1: + dbase = rev + cbase = r.chainbase(rev) + clen = r.chainlen(rev) + p1, p2 = r.parentrevs(rev) + rs = r.rawsize(rev) + ts = ts + rs + heads -= set(r.parentrevs(rev)) + heads.add(rev) + try: + compression = ts / r.end(rev) + except ZeroDivisionError: + 
compression = 0 + ui.write("%5d %5d %5d %5d %5d %10d %4d %4d %4d %7d %9d " + "%11d %5d %8d\n" % + (rev, p1, p2, r.start(rev), r.end(rev), + r.start(dbase), r.start(cbase), + r.start(p1), r.start(p2), + rs, ts, compression, len(heads), clen)) + return 0 + + v = r.version + format = v & 0xFFFF + flags = [] + gdelta = False + if v & revlog.REVLOGNGINLINEDATA: + flags.append('inline') + if v & revlog.REVLOGGENERALDELTA: + gdelta = True + flags.append('generaldelta') + if not flags: + flags = ['(none)'] + + nummerges = 0 + numfull = 0 + numprev = 0 + nump1 = 0 + nump2 = 0 + numother = 0 + nump1prev = 0 + nump2prev = 0 + chainlengths = [] + + datasize = [None, 0, 0] + fullsize = [None, 0, 0] + deltasize = [None, 0, 0] + chunktypecounts = {} + chunktypesizes = {} + + def addsize(size, l): + if l[0] is None or size < l[0]: + l[0] = size + if size > l[1]: + l[1] = size + l[2] += size + + numrevs = len(r) + for rev in xrange(numrevs): + p1, p2 = r.parentrevs(rev) + delta = r.deltaparent(rev) + if format > 0: + addsize(r.rawsize(rev), datasize) + if p2 != nullrev: + nummerges += 1 + size = r.length(rev) + if delta == nullrev: + chainlengths.append(0) + numfull += 1 + addsize(size, fullsize) + else: + chainlengths.append(chainlengths[delta] + 1) + addsize(size, deltasize) + if delta == rev - 1: + numprev += 1 + if delta == p1: + nump1prev += 1 + elif delta == p2: + nump2prev += 1 + elif delta == p1: + nump1 += 1 + elif delta == p2: + nump2 += 1 + elif delta != nullrev: + numother += 1 + + # Obtain data on the raw chunks in the revlog. 
+ chunk = r._chunkraw(rev, rev)[1] + if chunk: + chunktype = chunk[0] + else: + chunktype = 'empty' + + if chunktype not in chunktypecounts: + chunktypecounts[chunktype] = 0 + chunktypesizes[chunktype] = 0 + + chunktypecounts[chunktype] += 1 + chunktypesizes[chunktype] += size + + # Adjust size min value for empty cases + for size in (datasize, fullsize, deltasize): + if size[0] is None: + size[0] = 0 + + numdeltas = numrevs - numfull + numoprev = numprev - nump1prev - nump2prev + totalrawsize = datasize[2] + datasize[2] /= numrevs + fulltotal = fullsize[2] + fullsize[2] /= numfull + deltatotal = deltasize[2] + if numrevs - numfull > 0: + deltasize[2] /= numrevs - numfull + totalsize = fulltotal + deltatotal + avgchainlen = sum(chainlengths) / numrevs + maxchainlen = max(chainlengths) + compratio = 1 + if totalsize: + compratio = totalrawsize / totalsize + + basedfmtstr = '%%%dd\n' + basepcfmtstr = '%%%dd %s(%%5.2f%%%%)\n' + + def dfmtstr(max): + return basedfmtstr % len(str(max)) + def pcfmtstr(max, padding=0): + return basepcfmtstr % (len(str(max)), ' ' * padding) + + def pcfmt(value, total): + if total: + return (value, 100 * float(value) / total) + else: + return value, 100.0 + + ui.write(('format : %d\n') % format) + ui.write(('flags : %s\n') % ', '.join(flags)) + + ui.write('\n') + fmt = pcfmtstr(totalsize) + fmt2 = dfmtstr(totalsize) + ui.write(('revisions : ') + fmt2 % numrevs) + ui.write((' merges : ') + fmt % pcfmt(nummerges, numrevs)) + ui.write((' normal : ') + fmt % pcfmt(numrevs - nummerges, numrevs)) + ui.write(('revisions : ') + fmt2 % numrevs) + ui.write((' full : ') + fmt % pcfmt(numfull, numrevs)) + ui.write((' deltas : ') + fmt % pcfmt(numdeltas, numrevs)) + ui.write(('revision size : ') + fmt2 % totalsize) + ui.write((' full : ') + fmt % pcfmt(fulltotal, totalsize)) + ui.write((' deltas : ') + fmt % pcfmt(deltatotal, totalsize)) + + def fmtchunktype(chunktype): + if chunktype == 'empty': + return ' %s : ' % chunktype + elif chunktype in 
string.ascii_letters: + return ' 0x%s (%s) : ' % (hex(chunktype), chunktype) + else: + return ' 0x%s : ' % hex(chunktype) + + ui.write('\n') + ui.write(('chunks : ') + fmt2 % numrevs) + for chunktype in sorted(chunktypecounts): + ui.write(fmtchunktype(chunktype)) + ui.write(fmt % pcfmt(chunktypecounts[chunktype], numrevs)) + ui.write(('chunks size : ') + fmt2 % totalsize) + for chunktype in sorted(chunktypecounts): + ui.write(fmtchunktype(chunktype)) + ui.write(fmt % pcfmt(chunktypesizes[chunktype], totalsize)) + + ui.write('\n') + fmt = dfmtstr(max(avgchainlen, compratio)) + ui.write(('avg chain length : ') + fmt % avgchainlen) + ui.write(('max chain length : ') + fmt % maxchainlen) + ui.write(('compression ratio : ') + fmt % compratio) + + if format > 0: + ui.write('\n') + ui.write(('uncompressed data size (min/max/avg) : %d / %d / %d\n') + % tuple(datasize)) + ui.write(('full revision size (min/max/avg) : %d / %d / %d\n') + % tuple(fullsize)) + ui.write(('delta size (min/max/avg) : %d / %d / %d\n') + % tuple(deltasize)) + + if numdeltas > 0: + ui.write('\n') + fmt = pcfmtstr(numdeltas) + fmt2 = pcfmtstr(numdeltas, 4) + ui.write(('deltas against prev : ') + fmt % pcfmt(numprev, numdeltas)) + if numprev > 0: + ui.write((' where prev = p1 : ') + fmt2 % pcfmt(nump1prev, + numprev)) + ui.write((' where prev = p2 : ') + fmt2 % pcfmt(nump2prev, + numprev)) + ui.write((' other : ') + fmt2 % pcfmt(numoprev, + numprev)) + if gdelta: + ui.write(('deltas against p1 : ') + + fmt % pcfmt(nump1, numdeltas)) + ui.write(('deltas against p2 : ') + + fmt % pcfmt(nump2, numdeltas)) + ui.write(('deltas against other : ') + fmt % pcfmt(numother, + numdeltas)) + +@command('debugrevspec', + [('', 'optimize', None, + _('print parsed tree after optimizing (DEPRECATED)')), + ('p', 'show-stage', [], + _('print parsed tree at the given stage'), _('NAME')), + ('', 'no-optimized', False, _('evaluate tree without optimization')), + ('', 'verify-optimized', False, _('verify optimized result')), 
+ ], + ('REVSPEC')) +def debugrevspec(ui, repo, expr, **opts): + """parse and apply a revision specification + + Use -p/--show-stage option to print the parsed tree at the given stages. + Use -p all to print tree at every stage. + + Use --verify-optimized to compare the optimized result with the unoptimized + one. Returns 1 if the optimized result differs. + """ + stages = [ + ('parsed', lambda tree: tree), + ('expanded', lambda tree: revsetlang.expandaliases(ui, tree)), + ('concatenated', revsetlang.foldconcat), + ('analyzed', revsetlang.analyze), + ('optimized', revsetlang.optimize), + ] + if opts['no_optimized']: + stages = stages[:-1] + if opts['verify_optimized'] and opts['no_optimized']: + raise error.Abort(_('cannot use --verify-optimized with ' + '--no-optimized')) + stagenames = set(n for n, f in stages) + + showalways = set() + showchanged = set() + if ui.verbose and not opts['show_stage']: + # show parsed tree by --verbose (deprecated) + showalways.add('parsed') + showchanged.update(['expanded', 'concatenated']) + if opts['optimize']: + showalways.add('optimized') + if opts['show_stage'] and opts['optimize']: + raise error.Abort(_('cannot use --optimize with --show-stage')) + if opts['show_stage'] == ['all']: + showalways.update(stagenames) + else: + for n in opts['show_stage']: + if n not in stagenames: + raise error.Abort(_('invalid stage name: %s') % n) + showalways.update(opts['show_stage']) + + treebystage = {} + printedtree = None + tree = revsetlang.parse(expr, lookup=repo.__contains__) + for n, f in stages: + treebystage[n] = tree = f(tree) + if n in showalways or (n in showchanged and tree != printedtree): + if opts['show_stage'] or n != 'parsed': + ui.write(("* %s:\n") % n) + ui.write(revsetlang.prettyformat(tree), "\n") + printedtree = tree + + if opts['verify_optimized']: + arevs = revset.makematcher(treebystage['analyzed'])(repo) + brevs = revset.makematcher(treebystage['optimized'])(repo) + if ui.verbose: + ui.note(("* analyzed set:\n"), 
smartset.prettyformat(arevs), "\n") + ui.note(("* optimized set:\n"), smartset.prettyformat(brevs), "\n") + arevs = list(arevs) + brevs = list(brevs) + if arevs == brevs: + return 0 + ui.write(('--- analyzed\n'), label='diff.file_a') + ui.write(('+++ optimized\n'), label='diff.file_b') + sm = difflib.SequenceMatcher(None, arevs, brevs) + for tag, alo, ahi, blo, bhi in sm.get_opcodes(): + if tag in ('delete', 'replace'): + for c in arevs[alo:ahi]: + ui.write('-%s\n' % c, label='diff.deleted') + if tag in ('insert', 'replace'): + for c in brevs[blo:bhi]: + ui.write('+%s\n' % c, label='diff.inserted') + if tag == 'equal': + for c in arevs[alo:ahi]: + ui.write(' %s\n' % c) + return 1 + + func = revset.makematcher(tree) + revs = func(repo) + if ui.verbose: + ui.note(("* set:\n"), smartset.prettyformat(revs), "\n") + for c in revs: + ui.write("%s\n" % c) + +@command('debugsetparents', [], _('REV1 [REV2]')) +def debugsetparents(ui, repo, rev1, rev2=None): + """manually set the parents of the current working directory + + This is useful for writing repository conversion tools, but should + be used with care. For example, neither the working directory nor the + dirstate is updated, so file status may be incorrect after running this + command. + + Returns 0 on success. + """ + + r1 = scmutil.revsingle(repo, rev1).node() + r2 = scmutil.revsingle(repo, rev2, 'null').node() + + with repo.wlock(): + repo.setparents(r1, r2) + +@command('debugsub', + [('r', 'rev', '', + _('revision to check'), _('REV'))], + _('[-r REV] [REV]')) +def debugsub(ui, repo, rev=None): + ctx = scmutil.revsingle(repo, rev, None) + for k, v in sorted(ctx.substate.items()): + ui.write(('path %s\n') % k) + ui.write((' source %s\n') % v[0]) + ui.write((' revision %s\n') % v[1]) + +@command('debugsuccessorssets', + [], + _('[REV]')) +def debugsuccessorssets(ui, repo, *revs): + """show set of successors for revision + + A successors set of changeset A is a consistent group of revisions that + succeed A. 
It contains non-obsolete changesets only. + + In most cases a changeset A has a single successors set containing a single + successor (changeset A replaced by A'). + + A changeset that is made obsolete with no successors is called "pruned". + Such changesets have no successors sets at all. + + A changeset that has been "split" will have a successors set containing + more than one successor. + + A changeset that has been rewritten in multiple different ways is called + "divergent". Such changesets have multiple successor sets (each of which + may also be split, i.e. have multiple successors). + + Results are displayed as follows:: + + + + + + + + Here rev2 has two possible (i.e. divergent) successors sets. The first + holds one element, whereas the second holds three (i.e. the changeset has + been split). + """ + # passed to successorssets caching computation from one call to another + cache = {} + ctx2str = str + node2str = short + if ui.debug(): + def ctx2str(ctx): + return ctx.hex() + node2str = hex + for rev in scmutil.revrange(repo, revs): + ctx = repo[rev] + ui.write('%s\n'% ctx2str(ctx)) + for succsset in obsolete.successorssets(repo, ctx.node(), cache): + if succsset: + ui.write(' ') + ui.write(node2str(succsset[0])) + for node in succsset[1:]: + ui.write(' ') + ui.write(node2str(node)) + ui.write('\n') + +@command('debugtemplate', + [('r', 'rev', [], _('apply template on changesets'), _('REV')), + ('D', 'define', [], _('define template keyword'), _('KEY=VALUE'))], + _('[-r REV]... [-D KEY=VALUE]... TEMPLATE'), + optionalrepo=True) +def debugtemplate(ui, repo, tmpl, **opts): + """parse and apply a template + + If -r/--rev is given, the template is processed as a log template and + applied to the given changesets. Otherwise, it is processed as a generic + template. + + Use --verbose to print the parsed tree. 
+ """ + revs = None + if opts['rev']: + if repo is None: + raise error.RepoError(_('there is no Mercurial repository here ' + '(.hg not found)')) + revs = scmutil.revrange(repo, opts['rev']) + + props = {} + for d in opts['define']: + try: + k, v = (e.strip() for e in d.split('=', 1)) + if not k or k == 'ui': + raise ValueError + props[k] = v + except ValueError: + raise error.Abort(_('malformed keyword definition: %s') % d) + + if ui.verbose: + aliases = ui.configitems('templatealias') + tree = templater.parse(tmpl) + ui.note(templater.prettyformat(tree), '\n') + newtree = templater.expandaliases(tree, aliases) + if newtree != tree: + ui.note(("* expanded:\n"), templater.prettyformat(newtree), '\n') + + mapfile = None + if revs is None: + k = 'debugtemplate' + t = formatter.maketemplater(ui, k, tmpl) + ui.write(templater.stringify(t(k, ui=ui, **props))) + else: + displayer = cmdutil.changeset_templater(ui, repo, None, opts, tmpl, + mapfile, buffered=False) + for r in revs: + displayer.show(repo[r], **props) + displayer.close() + @command('debugupgraderepo', [ ('o', 'optimize', [], _('extra optimization to perform'), _('NAME')), ('', 'run', False, _('performs an upgrade')), @@ -875,3 +2070,43 @@ unable to access the repository should be low. """ return repair.upgraderepo(ui, repo, run=run, optimize=optimize) + +@command('debugwalk', commands.walkopts, _('[OPTION]... 
[FILE]...'), + inferrepo=True) +def debugwalk(ui, repo, *pats, **opts): + """show how files match on given patterns""" + m = scmutil.match(repo[None], pats, opts) + items = list(repo.walk(m)) + if not items: + return + f = lambda fn: fn + if ui.configbool('ui', 'slash') and pycompat.ossep != '/': + f = lambda fn: util.normpath(fn) + fmt = 'f %%-%ds %%-%ds %%s' % ( + max([len(abs) for abs in items]), + max([len(m.rel(abs)) for abs in items])) + for abs in items: + line = fmt % (abs, f(m.rel(abs)), m.exact(abs) and 'exact' or '') + ui.write("%s\n" % line.rstrip()) + +@command('debugwireargs', + [('', 'three', '', 'three'), + ('', 'four', '', 'four'), + ('', 'five', '', 'five'), + ] + commands.remoteopts, + _('REPO [OPTIONS]... [ONE [TWO]]'), + norepo=True) +def debugwireargs(ui, repopath, *vals, **opts): + repo = hg.peer(ui, opts, repopath) + for opt in commands.remoteopts: + del opts[opt[1]] + args = {} + for k, v in opts.iteritems(): + if v: + args[k] = v + # run twice to check that we don't mess up the stream for the next command + res1 = repo.debugwireargs(*vals, **args) + res2 = repo.debugwireargs(*vals, **args) + ui.write("%s\n" % res1) + if res1 != res2: + ui.warn("%s\n" % res2) diff -r ed5b25874d99 -r 4baf79a77afa mercurial/destutil.py --- a/mercurial/destutil.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/destutil.py Fri Mar 24 08:37:26 2017 -0700 @@ -12,37 +12,10 @@ bookmarks, error, obsolete, + scmutil, ) -def _destupdatevalidate(repo, rev, clean, check): - """validate that the destination comply to various rules - - This exists as its own function to help wrapping from extensions.""" - wc = repo[None] - p1 = wc.p1() - if not clean: - # Check that the update is linear. 
- # - # Mercurial do not allow update-merge for non linear pattern - # (that would be technically possible but was considered too confusing - # for user a long time ago) - # - # See mercurial.merge.update for details - if p1.rev() not in repo.changelog.ancestors([rev], inclusive=True): - dirty = wc.dirty(missing=True) - foreground = obsolete.foreground(repo, [p1.node()]) - if not repo[rev].node() in foreground: - if dirty: - msg = _("uncommitted changes") - hint = _("commit and merge, or update --clean to" - " discard changes") - raise error.UpdateAbort(msg, hint=hint) - elif not check: # destination is not a descendant. - msg = _("not a linear update") - hint = _("merge or update --check to force update") - raise error.UpdateAbort(msg, hint=hint) - -def _destupdateobs(repo, clean, check): +def _destupdateobs(repo, clean): """decide of an update destination from obsolescence markers""" node = None wc = repo[None] @@ -78,7 +51,7 @@ movemark = repo['.'].node() return node, movemark, None -def _destupdatebook(repo, clean, check): +def _destupdatebook(repo, clean): """decide on an update destination from active bookmark""" # we also move the active bookmark, if any activemark = None @@ -87,7 +60,7 @@ activemark = node return node, movemark, activemark -def _destupdatebranch(repo, clean, check): +def _destupdatebranch(repo, clean): """decide on an update destination from current branch This ignores closed branch heads. 
@@ -113,7 +86,7 @@ node = repo['.'].node() return node, movemark, None -def _destupdatebranchfallback(repo, clean, check): +def _destupdatebranchfallback(repo, clean): """decide on an update destination from closed heads in current branch""" wc = repo[None] currentbranch = wc.branch() @@ -143,7 +116,7 @@ 'branchfallback': _destupdatebranchfallback, } -def destupdate(repo, clean=False, check=False): +def destupdate(repo, clean=False): """destination for bare update operation return (rev, movemark, activemark) @@ -156,13 +129,11 @@ node = movemark = activemark = None for step in destupdatesteps: - node, movemark, activemark = destupdatestepmap[step](repo, clean, check) + node, movemark, activemark = destupdatestepmap[step](repo, clean) if node is not None: break rev = repo[node].rev() - _destupdatevalidate(repo, rev, clean, check) - return rev, movemark, activemark msgdestmerge = { @@ -372,9 +343,6 @@ def desthistedit(ui, repo): """Default base revision to edit for `hg histedit`.""" - # Avoid cycle: scmutil -> revset -> destutil - from . import scmutil - default = ui.config('histedit', 'defaultrev', histeditdefaultrevset) if default: revs = scmutil.revrange(repo, [default]) diff -r ed5b25874d99 -r 4baf79a77afa mercurial/dirstate.py --- a/mercurial/dirstate.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/dirstate.py Fri Mar 24 08:37:26 2017 -0700 @@ -23,6 +23,7 @@ pathutil, pycompat, scmutil, + txnutil, util, ) @@ -54,26 +55,16 @@ def nonnormalentries(dmap): '''Compute the nonnormal dirstate entries from the dmap''' try: - return parsers.nonnormalentries(dmap) + return parsers.nonnormalotherparententries(dmap) except AttributeError: - return set(fname for fname, e in dmap.iteritems() - if e[0] != 'n' or e[3] == -1) - -def _trypending(root, vfs, filename): - '''Open file to be read according to HG_PENDING environment variable - - This opens '.pending' of specified 'filename' only when HG_PENDING - is equal to 'root'. - - This returns '(fp, is_pending_opened)' tuple. 
- ''' - if root == encoding.environ.get('HG_PENDING'): - try: - return (vfs('%s.pending' % filename), True) - except IOError as inst: - if inst.errno != errno.ENOENT: - raise - return (vfs(filename), False) + nonnorm = set() + otherparent = set() + for fname, e in dmap.iteritems(): + if e[0] != 'n' or e[3] == -1: + nonnorm.add(fname) + if e[0] == 'n' and e[2] == -2: + otherparent.add(fname) + return nonnorm, otherparent class dirstate(object): @@ -104,6 +95,7 @@ self._pendingfilename = '%s.pending' % self._filename self._plchangecallbacks = {} self._origpl = None + self._updatedfiles = set() # for consistent view between _pl() and _read() invocations self._pendingmode = None @@ -145,7 +137,15 @@ @propertycache def _nonnormalset(self): - return nonnormalentries(self._map) + nonnorm, otherparents = nonnormalentries(self._map) + self._otherparentset = otherparents + return nonnorm + + @propertycache + def _otherparentset(self): + nonnorm, otherparents = nonnormalentries(self._map) + self._nonnormalset = nonnorm + return otherparents @propertycache def _filefoldmap(self): @@ -355,7 +355,12 @@ self._pl = p1, p2 copies = {} if oldp2 != nullid and p2 == nullid: - for f, s in self._map.iteritems(): + candidatefiles = self._nonnormalset.union(self._otherparentset) + for f in candidatefiles: + s = self._map.get(f) + if s is None: + continue + # Discard 'm' markers when moving away from a merge state if s[0] == 'm': if f in self._copymap: @@ -385,7 +390,7 @@ raise def _opendirstatefile(self): - fp, mode = _trypending(self._root, self._opener, self._filename) + fp, mode = txnutil.trypending(self._root, self._opener, self._filename) if self._pendingmode is not None and self._pendingmode != mode: fp.close() raise error.Abort(_('working directory state may be ' @@ -441,11 +446,13 @@ def invalidate(self): for a in ("_map", "_copymap", "_filefoldmap", "_dirfoldmap", "_branch", - "_pl", "_dirs", "_ignore", "_nonnormalset"): + "_pl", "_dirs", "_ignore", "_nonnormalset", + 
"_otherparentset"): if a in self.__dict__: delattr(self, a) self._lastnormaltime = 0 self._dirty = False + self._updatedfiles.clear() self._parentwriters = 0 self._origpl = None @@ -456,8 +463,11 @@ self._dirty = True if source is not None: self._copymap[dest] = source + self._updatedfiles.add(source) + self._updatedfiles.add(dest) elif dest in self._copymap: del self._copymap[dest] + self._updatedfiles.add(dest) def copied(self, file): return self._copymap.get(file, None) @@ -474,6 +484,8 @@ if normed in self._filefoldmap: del self._filefoldmap[normed] + self._updatedfiles.add(f) + def _addpath(self, f, state, mode, size, mtime): oldstate = self[f] if state == 'a' or oldstate == 'r': @@ -490,9 +502,12 @@ if oldstate in "?r" and "_dirs" in self.__dict__: self._dirs.addpath(f) self._dirty = True + self._updatedfiles.add(f) self._map[f] = dirstatetuple(state, mode, size, mtime) if state != 'n' or mtime == -1: self._nonnormalset.add(f) + if size == -2: + self._otherparentset.add(f) def normal(self, f): '''Mark a file normal and clean.''' @@ -567,6 +582,7 @@ size = -1 elif entry[0] == 'n' and entry[2] == -2: # other parent size = -2 + self._otherparentset.add(f) self._map[f] = dirstatetuple('r', 0, size, 0) self._nonnormalset.add(f) if size == 0 and f in self._copymap: @@ -666,11 +682,13 @@ def clear(self): self._map = {} self._nonnormalset = set() + self._otherparentset = set() if "_dirs" in self.__dict__: delattr(self, "_dirs") self._copymap = {} self._pl = [nullid, nullid] self._lastnormaltime = 0 + self._updatedfiles.clear() self._dirty = True def rebuild(self, parent, allfiles, changedfiles=None): @@ -707,13 +725,15 @@ # emulate dropping timestamp in 'parsers.pack_dirstate' now = _getfsnow(self._opener) dmap = self._map - for f, e in dmap.iteritems(): - if e[0] == 'n' and e[3] == now: + for f in self._updatedfiles: + e = dmap.get(f) + if e is not None and e[0] == 'n' and e[3] == now: dmap[f] = dirstatetuple(e[0], e[1], e[2], -1) self._nonnormalset.add(f) # emulate 
that all 'dirstate.normal' results are written out self._lastnormaltime = 0 + self._updatedfiles.clear() # delay writing in-memory changes out tr.addfilegenerator('dirstate', (self._filename,), @@ -762,7 +782,7 @@ break st.write(parsers.pack_dirstate(self._map, self._copymap, self._pl, now)) - self._nonnormalset = nonnormalentries(self._map) + self._nonnormalset, self._otherparentset = nonnormalentries(self._map) st.close() self._lastnormaltime = 0 self._dirty = self._dirtypl = False @@ -1059,7 +1079,7 @@ # a) not matching matchfn b) ignored, c) missing, or d) under a # symlink directory. if not results and matchalways: - visit = dmap.keys() + visit = [f for f in dmap] else: visit = [f for f in dmap if f not in results and matchfn(f)] visit.sort() @@ -1095,9 +1115,9 @@ else: # We may not have walked the full directory tree above, # so stat and check everything we missed. - nf = iter(visit).next + iv = iter(visit) for st in util.statfiles([join(i) for i in visit]): - results[nf()] = st + results[next(iv)] = st return results def status(self, match, subrepos, ignored, clean, unknown): @@ -1224,8 +1244,9 @@ # use '_writedirstate' instead of 'write' to write changes certainly, # because the latter omits writing out if transaction is running. # output file will be used to create backup of dirstate at this point. 
- self._writedirstate(self._opener(filename, "w", atomictemp=True, - checkambig=True)) + if self._dirty or not self._opener.exists(filename): + self._writedirstate(self._opener(filename, "w", atomictemp=True, + checkambig=True)) if tr: # ensure that subsequent tr.writepending returns True for @@ -1239,8 +1260,13 @@ # end of this transaction tr.registertmp(filename, location='plain') - self._opener.write(prefix + self._filename + suffix, - self._opener.tryread(filename)) + backupname = prefix + self._filename + suffix + assert backupname != filename + self._opener.tryunlink(backupname) + # hardlink backup is okay because _writedirstate is always called + # with an "atomictemp=True" file. + util.copyfile(self._opener.join(filename), + self._opener.join(backupname), hardlink=True) def restorebackup(self, tr, suffix='', prefix=''): '''Restore dirstate by backup file with suffix''' diff -r ed5b25874d99 -r 4baf79a77afa mercurial/discovery.py --- a/mercurial/discovery.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/discovery.py Fri Mar 24 08:37:26 2017 -0700 @@ -343,38 +343,13 @@ oldhs.update(unsyncedheads) candidate_newhs.update(unsyncedheads) dhs = None # delta heads, the new heads on branch - discardedheads = set() if not repo.obsstore: + discardedheads = set() newhs = candidate_newhs else: - # remove future heads which are actually obsoleted by another - # pushed element: - # - # XXX as above, There are several cases this code does not handle - # XXX properly - # - # (1) if <nh> is public, it won't be affected by obsolete marker - # and a new is created - # - # (2) if the new heads have ancestors which are not obsolete and - # not ancestors of any other heads we will have a new head too. - # - # These two cases will be easy to handle for known changeset but - # much more tricky for unsynced changes.
- # - # In addition, this code is confused by prune as it only looks for - # successors of the heads (none if pruned) leading to issue4354 - newhs = set() - for nh in candidate_newhs: - if nh in repo and repo[nh].phase() <= phases.public: - newhs.add(nh) - else: - for suc in obsolete.allsuccessors(repo.obsstore, [nh]): - if suc != nh and suc in allfuturecommon: - discardedheads.add(nh) - break - else: - newhs.add(nh) + newhs, discardedheads = _postprocessobsolete(pushop, + allfuturecommon, + candidate_newhs) unsynced = sorted(h for h in unsyncedheads if h not in discardedheads) if unsynced: if None in unsynced: @@ -434,3 +409,42 @@ repo.ui.note((" %s\n") % short(h)) if errormsg: raise error.Abort(errormsg, hint=hint) + +def _postprocessobsolete(pushop, futurecommon, candidate_newhs): + """post process the list of new heads with obsolescence information + + Exists as a subfunction to contain the complexity and allow extensions to + experiment with smarter logic. + Returns (newheads, discarded_heads) tuple + """ + # remove future heads which are actually obsoleted by another + # pushed element: + # + # XXX as above, There are several cases this code does not handle + # XXX properly + # + # (1) if <nh> is public, it won't be affected by obsolete marker + # and a new is created + # + # (2) if the new heads have ancestors which are not obsolete and + # not ancestors of any other heads we will have a new head too. + # + # These two cases will be easy to handle for known changeset but + # much more tricky for unsynced changes.
+ # + # In addition, this code is confused by prune as it only looks for + # successors of the heads (none if pruned) leading to issue4354 + repo = pushop.repo + newhs = set() + discarded = set() + for nh in candidate_newhs: + if nh in repo and repo[nh].phase() <= phases.public: + newhs.add(nh) + else: + for suc in obsolete.allsuccessors(repo.obsstore, [nh]): + if suc != nh and suc in futurecommon: + discarded.add(nh) + break + else: + newhs.add(nh) + return newhs, discarded diff -r ed5b25874d99 -r 4baf79a77afa mercurial/dispatch.py --- a/mercurial/dispatch.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/dispatch.py Fri Mar 24 08:37:26 2017 -0700 @@ -33,6 +33,7 @@ extensions, fancyopts, fileset, + help, hg, hook, profiling, @@ -91,6 +92,9 @@ if inst.hint: write(_("(%s)\n") % inst.hint) +def _formatargs(args): + return ' '.join(util.shellquote(a) for a in args) + def dispatch(req): "run the command specified in req.args" if req.ferr: @@ -122,8 +126,8 @@ _formatparse(ferr.write, inst) return -1 - msg = ' '.join(' ' in a and repr(a) or a for a in req.args) - starttime = time.time() + msg = _formatargs(req.args) + starttime = util.timer() ret = None try: ret = _runcatch(req) @@ -135,8 +139,11 @@ raise ret = -1 finally: - duration = time.time() - starttime + duration = util.timer() - starttime req.ui.flush() + if req.ui.logblockedtimes: + req.ui._blockedtimes['command_duration'] = duration * 1000 + req.ui.log('uiblocked', 'ui blocked ms', **req.ui._blockedtimes) req.ui.log("commandfinish", "%s exited %s after %0.2f seconds\n", msg, ret or 0, duration) return ret @@ -230,28 +237,35 @@ (inst.args[0], " ".join(inst.args[1]))) except error.CommandError as inst: if inst.args[0]: + ui.pager('help') ui.warn(_("hg %s: %s\n") % (inst.args[0], inst.args[1])) commands.help_(ui, inst.args[0], full=False, command=True) else: + ui.pager('help') ui.warn(_("hg: %s\n") % inst.args[1]) commands.help_(ui, 'shortlist') except error.ParseError as inst: _formatparse(ui.warn, inst) return 
-1 except error.UnknownCommand as inst: - ui.warn(_("hg: unknown command '%s'\n") % inst.args[0]) + nocmdmsg = _("hg: unknown command '%s'\n") % inst.args[0] try: # check if the command is in a disabled extension # (but don't check for extensions themselves) - commands.help_(ui, inst.args[0], unknowncmd=True) + formatted = help.formattedhelp(ui, inst.args[0], unknowncmd=True) + ui.warn(nocmdmsg) + ui.write(formatted) except (error.UnknownCommand, error.Abort): suggested = False if len(inst.args) == 2: sim = _getsimilar(inst.args[1], inst.args[0]) if sim: + ui.warn(nocmdmsg) _reportsimilar(ui.warn, sim) suggested = True if not suggested: + ui.pager('help') + ui.warn(nocmdmsg) commands.help_(ui, 'shortlist') except IOError: raise @@ -275,7 +289,7 @@ if num < len(givenargs): return givenargs[num] raise error.Abort(_('too few arguments for command alias')) - cmd = re.sub(r'\$(\d+|\$)', replacer, cmd) + cmd = re.sub(br'\$(\d+|\$)', replacer, cmd) givenargs = [x for i, x in enumerate(givenargs) if i not in nums] args = pycompat.shlexsplit(cmd) @@ -345,7 +359,8 @@ return '' cmd = re.sub(r'\$(\d+|\$)', _checkvar, self.definition[1:]) cmd = aliasinterpolate(self.name, args, cmd) - return ui.system(cmd, environ=env) + return ui.system(cmd, environ=env, + blockedtag='alias_%s' % self.name) self.fn = fn return @@ -460,7 +475,8 @@ args = aliasargs(entry[0], args) defaults = ui.config("defaults", cmd) if defaults: - args = map(util.expandpath, pycompat.shlexsplit(defaults)) + args + args = pycompat.maplist( + util.expandpath, pycompat.shlexsplit(defaults)) + args c = list(entry[1]) else: cmd = None @@ -655,107 +671,122 @@ rpath = _earlygetopt(["-R", "--repository", "--repo"], args) path, lui = _getlocal(ui, rpath) - # Configure extensions in phases: uisetup, extsetup, cmdtable, and - # reposetup. Programs like TortoiseHg will call _dispatch several - # times so we keep track of configured extensions in _loaded. 
- extensions.loadall(lui) - exts = [ext for ext in extensions.extensions() if ext[0] not in _loaded] - # Propagate any changes to lui.__class__ by extensions - ui.__class__ = lui.__class__ - - # (uisetup and extsetup are handled in extensions.loadall) - - for name, module in exts: - for objname, loadermod, loadername in extraloaders: - extraobj = getattr(module, objname, None) - if extraobj is not None: - getattr(loadermod, loadername)(ui, name, extraobj) - _loaded.add(name) - - # (reposetup is handled in hg.repository) - # Side-effect of accessing is debugcommands module is guaranteed to be # imported and commands.table is populated. debugcommands.command - addaliases(lui, commands.table) - - # All aliases and commands are completely defined, now. - # Check abbreviation/ambiguity of shell alias. - shellaliasfn = _checkshellalias(lui, ui, args) - if shellaliasfn: - with profiling.maybeprofile(lui): - return shellaliasfn() - - # check for fallback encoding - fallback = lui.config('ui', 'fallbackencoding') - if fallback: - encoding.fallbackencoding = fallback - - fullargs = args - cmd, func, args, options, cmdoptions = _parse(lui, args) - - if options["config"]: - raise error.Abort(_("option --config may not be abbreviated!")) - if options["cwd"]: - raise error.Abort(_("option --cwd may not be abbreviated!")) - if options["repository"]: - raise error.Abort(_( - "option -R has to be separated from other options (e.g. 
not -qR) " - "and --repository may only be abbreviated as --repo!")) - - if options["encoding"]: - encoding.encoding = options["encoding"] - if options["encodingmode"]: - encoding.encodingmode = options["encodingmode"] - if options["time"]: - def get_times(): - t = os.times() - if t[4] == 0.0: # Windows leaves this as zero, so use time.clock() - t = (t[0], t[1], t[2], t[3], time.clock()) - return t - s = get_times() - def print_time(): - t = get_times() - ui.warn(_("time: real %.3f secs (user %.3f+%.3f sys %.3f+%.3f)\n") % - (t[4]-s[4], t[0]-s[0], t[2]-s[2], t[1]-s[1], t[3]-s[3])) - atexit.register(print_time) - uis = set([ui, lui]) if req.repo: uis.add(req.repo.ui) - if options['verbose'] or options['debug'] or options['quiet']: - for opt in ('verbose', 'debug', 'quiet'): - val = str(bool(options[opt])) - for ui_ in uis: - ui_.setconfig('ui', opt, val, '--' + opt) - - if options['profile']: + if '--profile' in args: for ui_ in uis: ui_.setconfig('profiling', 'enabled', 'true', '--profile') - if options['traceback']: - for ui_ in uis: - ui_.setconfig('ui', 'traceback', 'on', '--traceback') + with profiling.maybeprofile(lui): + # Configure extensions in phases: uisetup, extsetup, cmdtable, and + # reposetup. Programs like TortoiseHg will call _dispatch several + # times so we keep track of configured extensions in _loaded. + extensions.loadall(lui) + exts = [ext for ext in extensions.extensions() if ext[0] not in _loaded] + # Propagate any changes to lui.__class__ by extensions + ui.__class__ = lui.__class__ + + # (uisetup and extsetup are handled in extensions.loadall) + + for name, module in exts: + for objname, loadermod, loadername in extraloaders: + extraobj = getattr(module, objname, None) + if extraobj is not None: + getattr(loadermod, loadername)(ui, name, extraobj) + _loaded.add(name) + + # (reposetup is handled in hg.repository) + + addaliases(lui, commands.table) + + # All aliases and commands are completely defined, now. 
+ # Check abbreviation/ambiguity of shell alias. + shellaliasfn = _checkshellalias(lui, ui, args) + if shellaliasfn: + return shellaliasfn() + + # check for fallback encoding + fallback = lui.config('ui', 'fallbackencoding') + if fallback: + encoding.fallbackencoding = fallback + + fullargs = args + cmd, func, args, options, cmdoptions = _parse(lui, args) + + if options["config"]: + raise error.Abort(_("option --config may not be abbreviated!")) + if options["cwd"]: + raise error.Abort(_("option --cwd may not be abbreviated!")) + if options["repository"]: + raise error.Abort(_( + "option -R has to be separated from other options (e.g. not " + "-qR) and --repository may only be abbreviated as --repo!")) - if options['noninteractive']: - for ui_ in uis: - ui_.setconfig('ui', 'interactive', 'off', '-y') + if options["encoding"]: + encoding.encoding = options["encoding"] + if options["encodingmode"]: + encoding.encodingmode = options["encodingmode"] + if options["time"]: + def get_times(): + t = os.times() + if t[4] == 0.0: + # Windows leaves this as zero, so use time.clock() + t = (t[0], t[1], t[2], t[3], time.clock()) + return t + s = get_times() + def print_time(): + t = get_times() + ui.warn( + _("time: real %.3f secs (user %.3f+%.3f sys %.3f+%.3f)\n") % + (t[4]-s[4], t[0]-s[0], t[2]-s[2], t[1]-s[1], t[3]-s[3])) + atexit.register(print_time) - if cmdoptions.get('insecure', False): + if options['verbose'] or options['debug'] or options['quiet']: + for opt in ('verbose', 'debug', 'quiet'): + val = str(bool(options[opt])) + if pycompat.ispy3: + val = val.encode('ascii') + for ui_ in uis: + ui_.setconfig('ui', opt, val, '--' + opt) + + if options['traceback']: + for ui_ in uis: + ui_.setconfig('ui', 'traceback', 'on', '--traceback') + + if options['noninteractive']: + for ui_ in uis: + ui_.setconfig('ui', 'interactive', 'off', '-y') + + if util.parsebool(options['pager']): + ui.pager('internal-always-' + cmd) + elif options['pager'] != 'auto': + ui.disablepager() + + 
if cmdoptions.get('insecure', False): + for ui_ in uis: + ui_.insecureconnections = True + + # setup color handling + coloropt = options['color'] for ui_ in uis: - ui_.insecureconnections = True + if coloropt: + ui_.setconfig('ui', 'color', coloropt, '--color') + color.setup(ui_) - if options['version']: - return commands.version_(ui) - if options['help']: - return commands.help_(ui, cmd, command=cmd is not None) - elif not cmd: - return commands.help_(ui, 'shortlist') + if options['version']: + return commands.version_(ui) + if options['help']: + return commands.help_(ui, cmd, command=cmd is not None) + elif not cmd: + return commands.help_(ui, 'shortlist') - with profiling.maybeprofile(lui): repo = None cmdpats = args[:] if not func.norepo: @@ -802,7 +833,7 @@ elif rpath: ui.warn(_("warning: --repository ignored\n")) - msg = ' '.join(' ' in a and repr(a) or a for a in fullargs) + msg = _formatargs(fullargs) ui.log("command", '%s\n', msg) strcmdopt = pycompat.strkwargs(cmdoptions) d = lambda: util.checksignature(func)(ui, *args, **strcmdopt) @@ -835,6 +866,8 @@ if ui.config('ui', 'supportcontact', None) is None: for name, mod in extensions.extensions(): testedwith = getattr(mod, 'testedwith', '') + if pycompat.ispy3 and isinstance(testedwith, str): + testedwith = testedwith.encode(u'utf-8') report = getattr(mod, 'buglink', _('the extension author.')) if not testedwith.strip(): # We found an untested extension. It's likely the culprit. 
@@ -855,7 +888,7 @@ worst = name, nearest, report if worst[0] is not None: name, testedwith, report = worst - if not isinstance(testedwith, str): + if not isinstance(testedwith, (bytes, str)): testedwith = '.'.join([str(c) for c in testedwith]) warning = (_('** Unknown exception encountered with ' 'possibly-broken third-party extension %s\n' @@ -869,7 +902,12 @@ bugtracker = _("https://mercurial-scm.org/wiki/BugTracker") warning = (_("** unknown exception encountered, " "please report by visiting\n** ") + bugtracker + '\n') - warning += ((_("** Python %s\n") % sys.version.replace('\n', '')) + + if pycompat.ispy3: + sysversion = sys.version.encode(u'utf-8') + else: + sysversion = sys.version + sysversion = sysversion.replace('\n', '') + warning += ((_("** Python %s\n") % sysversion) + (_("** Mercurial Distributed SCM (version %s)\n") % util.version()) + (_("** Extensions loaded: %s\n") % diff -r ed5b25874d99 -r 4baf79a77afa mercurial/encoding.py --- a/mercurial/encoding.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/encoding.py Fri Mar 24 08:37:26 2017 -0700 @@ -196,6 +196,24 @@ except LookupError as k: raise error.Abort(k, hint="please check your locale settings") +def unitolocal(u): + """Convert a unicode string to a byte string of local encoding""" + return tolocal(u.encode('utf-8')) + +def unifromlocal(s): + """Convert a byte string of local encoding to a unicode string""" + return fromlocal(s).decode('utf-8') + +# converter functions between native str and byte string. use these if the +# character encoding is not aware (e.g. exception message) or is known to +# be locale dependent (e.g. date formatting.) 
+if pycompat.ispy3: + strtolocal = unitolocal + strfromlocal = unifromlocal +else: + strtolocal = str + strfromlocal = str + if not _nativeenviron: # now encoding and helper functions are available, recreate the environ # dict to be exported to other modules diff -r ed5b25874d99 -r 4baf79a77afa mercurial/error.py --- a/mercurial/error.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/error.py Fri Mar 24 08:37:26 2017 -0700 @@ -22,7 +22,7 @@ pass remaining arguments to the exception class. """ def __init__(self, *args, **kw): - self.hint = kw.pop('hint', None) + self.hint = kw.pop(r'hint', None) super(Hint, self).__init__(*args, **kw) class RevlogError(Hint, Exception): diff -r ed5b25874d99 -r 4baf79a77afa mercurial/exchange.py --- a/mercurial/exchange.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/exchange.py Fri Mar 24 08:37:26 2017 -0700 @@ -1737,9 +1737,15 @@ if url.startswith('remote:http:') or url.startswith('remote:https:'): captureoutput = True try: + # note: outside bundle1, 'heads' is expected to be empty and this + # 'check_heads' call will be a no-op check_heads(repo, heads, 'uploading changes') # push can proceed - if util.safehasattr(cg, 'params'): + if not util.safehasattr(cg, 'params'): + # legacy case: bundle1 (changegroup 01) + lockandtr[1] = repo.lock() + r = cg.apply(repo, source, url) + else: r = None try: def gettransaction(): @@ -1778,9 +1784,6 @@ mandatory=False) parts.append(part) raise - else: - lockandtr[1] = repo.lock() - r = cg.apply(repo, source, url) finally: lockmod.release(lockandtr[2], lockandtr[1], lockandtr[0]) if recordout is not None: diff -r ed5b25874d99 -r 4baf79a77afa mercurial/exewrapper.c --- a/mercurial/exewrapper.c Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/exewrapper.c Fri Mar 24 08:37:26 2017 -0700 @@ -67,51 +67,35 @@ } pydll = NULL; - /* - We first check, that environment variable PYTHONHOME is *not* set.
- This just mimicks the behavior of the regular python.exe, which uses - PYTHONHOME to find its installation directory (if it has been set). - Note: Users of HackableMercurial are expected to *not* set PYTHONHOME! - */ - if (GetEnvironmentVariable("PYTHONHOME", envpyhome, - sizeof(envpyhome)) == 0) - { - /* - Environment var PYTHONHOME is *not* set. Let's see if we are - running inside a HackableMercurial. - */ + + p = strrchr(pyhome, '\\'); + if (p == NULL) { + err = "can't find backslash in module filename"; + goto bail; + } + *p = 0; /* cut at directory */ + + /* check for private Python of HackableMercurial */ + strcat_s(pyhome, sizeof(pyhome), "\\hg-python"); - p = strrchr(pyhome, '\\'); - if (p == NULL) { - err = "can't find backslash in module filename"; + hfind = FindFirstFile(pyhome, &fdata); + if (hfind != INVALID_HANDLE_VALUE) { + /* Path .\hg-python exists. We are probably in HackableMercurial + scenario, so let's load python dll from this dir. */ + FindClose(hfind); + strcpy_s(pydllfile, sizeof(pydllfile), pyhome); + strcat_s(pydllfile, sizeof(pydllfile), "\\" HGPYTHONLIB ".dll"); + pydll = LoadLibrary(pydllfile); + if (pydll == NULL) { + err = "failed to load private Python DLL " HGPYTHONLIB ".dll"; goto bail; } - *p = 0; /* cut at directory */ - - /* check for private Python of HackableMercurial */ - strcat_s(pyhome, sizeof(pyhome), "\\hg-python"); - - hfind = FindFirstFile(pyhome, &fdata); - if (hfind != INVALID_HANDLE_VALUE) { - /* path pyhome exists, let's use it */ - FindClose(hfind); - strcpy_s(pydllfile, sizeof(pydllfile), pyhome); - strcat_s(pydllfile, sizeof(pydllfile), - "\\" HGPYTHONLIB ".dll"); - pydll = LoadLibrary(pydllfile); - if (pydll == NULL) { - err = "failed to load private Python DLL " - HGPYTHONLIB ".dll"; - goto bail; - } - Py_SetPythonHome = (void*)GetProcAddress(pydll, - "Py_SetPythonHome"); - if (Py_SetPythonHome == NULL) { - err = "failed to get Py_SetPythonHome"; - goto bail; - } - Py_SetPythonHome(pyhome); + 
Py_SetPythonHome = (void*)GetProcAddress(pydll, "Py_SetPythonHome"); + if (Py_SetPythonHome == NULL) { + err = "failed to get Py_SetPythonHome"; + goto bail; } + Py_SetPythonHome(pyhome); } if (pydll == NULL) { diff -r ed5b25874d99 -r 4baf79a77afa mercurial/extensions.py --- a/mercurial/extensions.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/extensions.py Fri Mar 24 08:37:26 2017 -0700 @@ -8,6 +8,7 @@ from __future__ import absolute_import import imp +import inspect import os from .i18n import ( @@ -17,6 +18,7 @@ from . import ( cmdutil, + encoding, error, pycompat, util, @@ -103,11 +105,16 @@ mod = _importh(name) return mod +def _forbytes(inst): + """Portably format an import error into a form suitable for + %-formatting into bytestrings.""" + return encoding.strtolocal(str(inst)) + def _reportimporterror(ui, err, failed, next): # note: this ui.debug happens before --debug is processed, # Use --config ui.debug=1 to see them. ui.debug('could not import %s (%s): trying %s\n' - % (failed, err, next)) + % (failed, _forbytes(err), next)) if ui.debugflag: ui.traceback() @@ -150,7 +157,7 @@ try: extsetup(ui) except TypeError: - if extsetup.func_code.co_argcount != 0: + if inspect.getargspec(extsetup).args: raise extsetup() # old extsetup with no ui argument @@ -159,7 +166,7 @@ newindex = len(_order) for (name, path) in result: if path: - if path[0] == '!': + if path[0:1] == '!': _disabledextensions[name] = path[1:] continue try: @@ -167,6 +174,7 @@ except KeyboardInterrupt: raise except Exception as inst: + inst = _forbytes(inst) if path: ui.warn(_("*** failed to import extension %s from %s: %s\n") % (name, path, inst)) @@ -362,7 +370,8 @@ '''find paths of disabled extensions. 
returns a dict of {name: path} removes /__init__.py from packages if strip_init is True''' import hgext - extpath = os.path.dirname(os.path.abspath(hgext.__file__)) + extpath = os.path.dirname( + os.path.abspath(pycompat.fsencode(hgext.__file__))) try: # might not be a filesystem path files = os.listdir(extpath) except OSError: diff -r ed5b25874d99 -r 4baf79a77afa mercurial/filemerge.py --- a/mercurial/filemerge.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/filemerge.py Fri Mar 24 08:37:26 2017 -0700 @@ -35,7 +35,9 @@ def _toolbool(ui, tool, part, default=False): return ui.configbool("merge-tools", tool + "." + part, default) -def _toollist(ui, tool, part, default=[]): +def _toollist(ui, tool, part, default=None): + if default is None: + default = [] return ui.configlist("merge-tools", tool + "." + part, default) internals = {} @@ -489,8 +491,11 @@ args = util.interpolate(r'\$', replace, args, lambda s: util.shellquote(util.localpath(s))) cmd = toolpath + ' ' + args + if _toolbool(ui, tool, "gui"): + repo.ui.status(_('running merge tool %s for file %s\n') % + (tool, fcd.path())) repo.ui.debug('launching merge tool: %s\n' % cmd) - r = ui.system(cmd, cwd=repo.root, environ=env) + r = ui.system(cmd, cwd=repo.root, environ=env, blockedtag='mergetool') repo.ui.debug('merge tool returned: %s\n' % r) return True, r, False @@ -582,7 +587,7 @@ pre = "%s~%s." 
% (os.path.basename(fullbase), prefix) (fd, name) = tempfile.mkstemp(prefix=pre, suffix=ext) data = repo.wwritedata(ctx.path(), ctx.data()) - f = os.fdopen(fd, "wb") + f = os.fdopen(fd, pycompat.sysstr("wb")) f.write(data) f.close() return name diff -r ed5b25874d99 -r 4baf79a77afa mercurial/fileset.py --- a/mercurial/fileset.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/fileset.py Fri Mar 24 08:37:26 2017 -0700 @@ -15,6 +15,7 @@ merge, parser, registrar, + scmutil, util, ) @@ -438,6 +439,52 @@ s.append(f) return s +@predicate('revs(revs, pattern)') +def revs(mctx, x): + """Evaluate set in the specified revisions. If the revset match multiple + revs, this will return file matching pattern in any of the revision. + """ + # i18n: "revs" is a keyword + r, x = getargs(x, 2, 2, _("revs takes two arguments")) + # i18n: "revs" is a keyword + revspec = getstring(r, _("first argument to revs must be a revision")) + repo = mctx.ctx.repo() + revs = scmutil.revrange(repo, [revspec]) + + found = set() + result = [] + for r in revs: + ctx = repo[r] + for f in getset(mctx.switch(ctx, _buildstatus(ctx, x)), x): + if f not in found: + found.add(f) + result.append(f) + return result + +@predicate('status(base, rev, pattern)') +def status(mctx, x): + """Evaluate predicate using status change between ``base`` and + ``rev``. 
Examples: + + - ``status(3, 7, added())`` - matches files added from "3" to "7" + """ + repo = mctx.ctx.repo() + # i18n: "status" is a keyword + b, r, x = getargs(x, 3, 3, _("status takes three arguments")) + # i18n: "status" is a keyword + baseerr = _("first argument to status must be a revision") + baserevspec = getstring(b, baseerr) + if not baserevspec: + raise error.ParseError(baseerr) + reverr = _("second argument to status must be a revision") + revspec = getstring(r, reverr) + if not revspec: + raise error.ParseError(reverr) + basenode, node = scmutil.revpair(repo, [baserevspec, revspec]) + basectx = repo[basenode] + ctx = repo[node] + return getset(mctx.switch(ctx, _buildstatus(ctx, x, basectx=basectx)), x) + @predicate('subrepo([pattern])') def subrepo(mctx, x): """Subrepositories whose paths match the given pattern. @@ -474,7 +521,7 @@ } class matchctx(object): - def __init__(self, ctx, subset=None, status=None): + def __init__(self, ctx, subset, status=None): self.ctx = ctx self.subset = subset self._status = status @@ -497,39 +544,71 @@ if (f in self.ctx and f not in removed) or f in unknown) def narrow(self, files): return matchctx(self.ctx, self.filter(files), self._status) + def switch(self, ctx, status=None): + subset = self.filter(_buildsubset(ctx, status)) + return matchctx(ctx, subset, status) + +class fullmatchctx(matchctx): + """A match context where any files in any revisions should be valid""" + + def __init__(self, ctx, status=None): + subset = _buildsubset(ctx, status) + super(fullmatchctx, self).__init__(ctx, subset, status) + def switch(self, ctx, status=None): + return fullmatchctx(ctx, status) + +# filesets using matchctx.switch() +_switchcallers = [ + 'revs', + 'status', +] def _intree(funcs, tree): if isinstance(tree, tuple): if tree[0] == 'func' and tree[1][0] == 'symbol': if tree[1][1] in funcs: return True + if tree[1][1] in _switchcallers: + # arguments won't be evaluated in the current context + return False for s in tree[1:]: 
if _intree(funcs, s): return True return False +def _buildsubset(ctx, status): + if status: + subset = [] + for c in status: + subset.extend(c) + return subset + else: + return list(ctx.walk(ctx.match([]))) + def getfileset(ctx, expr): tree = parse(expr) + return getset(fullmatchctx(ctx, _buildstatus(ctx, tree)), tree) +def _buildstatus(ctx, tree, basectx=None): # do we need status info? + + # temporary boolean to simplify the next conditional + purewdir = ctx.rev() is None and basectx is None + if (_intree(_statuscallers, tree) or # Using matchctx.existing() on a workingctx requires us to check # for deleted files. - (ctx.rev() is None and _intree(_existingcallers, tree))): + (purewdir and _intree(_existingcallers, tree))): unknown = _intree(['unknown'], tree) ignored = _intree(['ignored'], tree) r = ctx.repo() - status = r.status(ctx.p1(), ctx, - unknown=unknown, ignored=ignored, clean=True) - subset = [] - for c in status: - subset.extend(c) + if basectx is None: + basectx = ctx.p1() + return r.status(basectx, ctx, + unknown=unknown, ignored=ignored, clean=True) else: - status = None - subset = list(ctx.walk(ctx.match([]))) - - return getset(matchctx(ctx, subset, status), tree) + return None def prettyformat(tree): return parser.prettyformat(tree, ('string', 'symbol')) diff -r ed5b25874d99 -r 4baf79a77afa mercurial/formatter.py --- a/mercurial/formatter.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/formatter.py Fri Mar 24 08:37:26 2017 -0700 @@ -12,6 +12,7 @@ - fm.write() for unconditional output - fm.condwrite() to show some extra data conditionally in plain output +- fm.context() to provide changectx to template output - fm.data() to provide extra data to JSON or template output - fm.plain() to show raw text that isn't provided to JSON or template output @@ -171,6 +172,9 @@ # name is mandatory argument for now, but it could be optional if # we have default template keyword, e.g. 
{item} return self._converter.formatlist(data, name, fmt, sep) + def context(self, **ctxs): + '''insert context objects to be used to render template keywords''' + pass def data(self, **data): '''insert data into item that's not shown in default output''' self._item.update(data) @@ -257,24 +261,26 @@ pass class debugformatter(baseformatter): - def __init__(self, ui, topic, opts): + def __init__(self, ui, out, topic, opts): baseformatter.__init__(self, ui, topic, opts, _nullconverter) - self._ui.write("%s = [\n" % self._topic) + self._out = out + self._out.write("%s = [\n" % self._topic) def _showitem(self): - self._ui.write(" " + repr(self._item) + ",\n") + self._out.write(" " + repr(self._item) + ",\n") def end(self): baseformatter.end(self) - self._ui.write("]\n") + self._out.write("]\n") class pickleformatter(baseformatter): - def __init__(self, ui, topic, opts): + def __init__(self, ui, out, topic, opts): baseformatter.__init__(self, ui, topic, opts, _nullconverter) + self._out = out self._data = [] def _showitem(self): self._data.append(self._item) def end(self): baseformatter.end(self) - self._ui.write(pickle.dumps(self._data)) + self._out.write(pickle.dumps(self._data)) def _jsonifyobj(v): if isinstance(v, dict): @@ -289,34 +295,35 @@ return 'true' elif v is False: return 'false' - elif isinstance(v, (int, float)): + elif isinstance(v, (int, long, float)): return str(v) else: return '"%s"' % encoding.jsonescape(v) class jsonformatter(baseformatter): - def __init__(self, ui, topic, opts): + def __init__(self, ui, out, topic, opts): baseformatter.__init__(self, ui, topic, opts, _nullconverter) - self._ui.write("[") - self._ui._first = True + self._out = out + self._out.write("[") + self._first = True def _showitem(self): - if self._ui._first: - self._ui._first = False + if self._first: + self._first = False else: - self._ui.write(",") + self._out.write(",") - self._ui.write("\n {\n") + self._out.write("\n {\n") first = True for k, v in 
sorted(self._item.items()): if first: first = False else: - self._ui.write(",\n") - self._ui.write(' "%s": %s' % (k, _jsonifyobj(v))) - self._ui.write("\n }") + self._out.write(",\n") + self._out.write(' "%s": %s' % (k, _jsonifyobj(v))) + self._out.write("\n }") def end(self): baseformatter.end(self) - self._ui.write("\n]\n") + self._out.write("\n]\n") class _templateconverter(object): '''convert non-primitive data types to be processed by templater''' @@ -342,13 +349,33 @@ lambda d: fmt % d[name]) class templateformatter(baseformatter): - def __init__(self, ui, topic, opts): + def __init__(self, ui, out, topic, opts): baseformatter.__init__(self, ui, topic, opts, _templateconverter) + self._out = out self._topic = topic - self._t = gettemplater(ui, topic, opts.get('template', '')) + self._t = gettemplater(ui, topic, opts.get('template', ''), + cache=templatekw.defaulttempl) + self._cache = {} # for templatekw/funcs to store reusable data + def context(self, **ctxs): + '''insert context objects to be used to render template keywords''' + assert all(k == 'ctx' for k in ctxs) + self._item.update(ctxs) def _showitem(self): - g = self._t(self._topic, ui=self._ui, **self._item) - self._ui.write(templater.stringify(g)) + # TODO: add support for filectx. probably each template keyword or + # function will have to declare dependent resources. e.g. + # @templatekeyword(..., requires=('ctx',)) + if 'ctx' in self._item: + props = templatekw.keywords.copy() + # explicitly-defined fields precede templatekw + props.update(self._item) + # but template resources must be always available + props['templ'] = self._t + props['repo'] = props['ctx'].repo() + props['revcache'] = {} + else: + props = self._item + g = self._t(self._topic, ui=self._ui, cache=self._cache, **props) + self._out.write(templater.stringify(g)) def lookuptemplate(ui, topic, tmpl): # looks like a literal template? @@ -382,17 +409,17 @@ # constant string? 
return tmpl, None -def gettemplater(ui, topic, spec): +def gettemplater(ui, topic, spec, cache=None): tmpl, mapfile = lookuptemplate(ui, topic, spec) assert not (tmpl and mapfile) if mapfile: - return templater.templater.frommapfile(mapfile) - return maketemplater(ui, topic, tmpl) + return templater.templater.frommapfile(mapfile, cache=cache) + return maketemplater(ui, topic, tmpl, cache=cache) -def maketemplater(ui, topic, tmpl, filters=None, cache=None): +def maketemplater(ui, topic, tmpl, cache=None): """Create a templater from a string template 'tmpl'""" aliases = ui.configitems('templatealias') - t = templater.templater(filters=filters, cache=cache, aliases=aliases) + t = templater.templater(cache=cache, aliases=aliases) if tmpl: t.cache[topic] = tmpl return t @@ -400,17 +427,17 @@ def formatter(ui, topic, opts): template = opts.get("template", "") if template == "json": - return jsonformatter(ui, topic, opts) + return jsonformatter(ui, ui, topic, opts) elif template == "pickle": - return pickleformatter(ui, topic, opts) + return pickleformatter(ui, ui, topic, opts) elif template == "debug": - return debugformatter(ui, topic, opts) + return debugformatter(ui, ui, topic, opts) elif template != "": - return templateformatter(ui, topic, opts) + return templateformatter(ui, ui, topic, opts) # developer config: ui.formatdebug elif ui.configbool('ui', 'formatdebug'): - return debugformatter(ui, topic, opts) + return debugformatter(ui, ui, topic, opts) # deprecated config: ui.formatjson elif ui.configbool('ui', 'formatjson'): - return jsonformatter(ui, topic, opts) + return jsonformatter(ui, ui, topic, opts) return plainformatter(ui, topic, opts) diff -r ed5b25874d99 -r 4baf79a77afa mercurial/graphmod.py --- a/mercurial/graphmod.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/graphmod.py Fri Mar 24 08:37:26 2017 -0700 @@ -22,6 +22,7 @@ from .node import nullrev from . 
import ( revset, + smartset, util, ) @@ -67,8 +68,8 @@ if gp is None: # precompute slow query as we know reachableroots() goes # through all revs (issue4782) - if not isinstance(revs, revset.baseset): - revs = revset.baseset(revs) + if not isinstance(revs, smartset.baseset): + revs = smartset.baseset(revs) gp = gpcache[mpar] = sorted(set(revset.reachableroots( repo, revs, [mpar]))) if not gp: @@ -181,6 +182,9 @@ knownparents = [] newparents = [] for ptype, parent in parents: + if parent == rev: + # self reference (should only be seen in null rev) + continue if parent in seen: knownparents.append(parent) else: @@ -190,8 +194,7 @@ ncols = len(seen) nextseen = seen[:] nextseen[nodeidx:nodeidx + 1] = newparents - edges = [(nodeidx, nextseen.index(p)) - for p in knownparents if p != nullrev] + edges = [(nodeidx, nextseen.index(p)) for p in knownparents] seen[:] = nextseen while len(newparents) > 2: diff -r ed5b25874d99 -r 4baf79a77afa mercurial/help.py --- a/mercurial/help.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/help.py Fri Mar 24 08:37:26 2017 -0700 @@ -33,14 +33,17 @@ webcommands, ) -_exclkeywords = [ +_exclkeywords = set([ + "(ADVANCED)", "(DEPRECATED)", "(EXPERIMENTAL)", + # i18n: "(ADVANCED)" is a keyword, must be translated consistently + _("(ADVANCED)"), # i18n: "(DEPRECATED)" is a keyword, must be translated consistently _("(DEPRECATED)"), # i18n: "(EXPERIMENTAL)" is a keyword, must be translated consistently _("(EXPERIMENTAL)"), - ] + ]) def listexts(header, exts, indent=1, showdeprecated=False): '''return a text listing of the given extensions''' @@ -186,6 +189,8 @@ internalstable = sorted([ (['bundles'], _('Bundles'), loaddoc('bundles', subdir='internals')), + (['censor'], _('Censor'), + loaddoc('censor', subdir='internals')), (['changegroups'], _('Changegroups'), loaddoc('changegroups', subdir='internals')), (['requirements'], _('Repository Requirements'), @@ -205,6 +210,7 @@ return ''.join(lines) helptable = sorted([ + (['color'], _("Colorizing 
Outputs"), loaddoc('color')), (["config", "hgrc"], _("Configuration Files"), loaddoc('config')), (["dates"], _("Date Formats"), loaddoc('dates')), (["patterns"], _("File Name Patterns"), loaddoc('patterns')), @@ -230,6 +236,7 @@ loaddoc('scripting')), (['internals'], _("Technical implementation topics"), internalshelp), + (['pager'], _("Pager Support"), loaddoc('pager')), ]) # Maps topics with sub-topics to a list of their sub-topics. @@ -605,3 +612,49 @@ rst.extend(helplist(None, **opts)) return ''.join(rst) + +def formattedhelp(ui, name, keep=None, unknowncmd=False, full=True, **opts): + """get help for a given topic (as a dotted name) as rendered rst + + Either returns the rendered help text or raises an exception. + """ + if keep is None: + keep = [] + else: + keep = list(keep) # make a copy so we can mutate this later + fullname = name + section = None + subtopic = None + if name and '.' in name: + name, remaining = name.split('.', 1) + remaining = encoding.lower(remaining) + if '.' 
in remaining: + subtopic, section = remaining.split('.', 1) + else: + if name in subtopics: + subtopic = remaining + else: + section = remaining + textwidth = ui.configint('ui', 'textwidth', 78) + termwidth = ui.termwidth() - 2 + if textwidth <= 0 or termwidth < textwidth: + textwidth = termwidth + text = help_(ui, name, + subtopic=subtopic, unknowncmd=unknowncmd, full=full, **opts) + + formatted, pruned = minirst.format(text, textwidth, keep=keep, + section=section) + + # We could have been given a weird ".foo" section without a name + # to look for, or we could have simply failed to find "foo.bar" + # because bar isn't a section of foo + if section and not (formatted and name): + raise error.Abort(_("help section not found: %s") % fullname) + + if 'verbose' in pruned: + keep.append('omitted') + else: + keep.append('notomitted') + formatted, pruned = minirst.format(text, textwidth, keep=keep, + section=section) + return formatted diff -r ed5b25874d99 -r 4baf79a77afa mercurial/help/color.txt --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/mercurial/help/color.txt Fri Mar 24 08:37:26 2017 -0700 @@ -0,0 +1,134 @@ +Mercurial can colorize output from several commands. + +For example, the diff command shows additions in green and deletions +in red, while the status command shows modified files in magenta. Many +other commands have analogous colors. It is possible to customize +these colors. + +To enable color use:: + + [ui] + color = auto + +Mode +==== + +Mercurial can use various systems to display color. The supported modes are +``ansi``, ``win32``, and ``terminfo``. See :hg:`help config.color` for details +about how to control the mode. + +Effects +======= + +Other effects in addition to color, like bold and underlined text, are +also available. By default, the terminfo database is used to find the +terminal codes used to change color and effect. 
If terminfo is not +available, then effects are rendered with the ECMA-48 SGR control +function (aka ANSI escape codes). + +The available effects in terminfo mode are 'blink', 'bold', 'dim', +'inverse', 'invisible', 'italic', 'standout', and 'underline'; in +ECMA-48 mode, the options are 'bold', 'inverse', 'italic', and +'underline'. How each is rendered depends on the terminal emulator. +Some may not be available for a given terminal type, and will be +silently ignored. + +If the terminfo entry for your terminal is missing codes for an effect +or has the wrong codes, you can add or override those codes in your +configuration:: + + [color] + terminfo.dim = \E[2m + +where '\E' is substituted with an escape character. + +Labels +====== + +Text receives color effects depending on the labels that it has. Many +default Mercurial commands emit labelled text. You can also define +your own labels in templates using the label function, see :hg:`help +templates`. A single portion of text may have more than one label. In +that case, effects given to the last label will override any other +effects. This includes the special "none" effect, which nullifies +other effects. + +Labels are normally invisible. In order to see these labels and their +position in the text, use the global --color=debug option. The same +anchor text may be associated to multiple labels, e.g. + + [log.changeset changeset.secret|changeset: 22611:6f0a53c8f587] + +The following are the default effects for some default labels. 
Default +effects may be overridden from your configuration file:: + + [color] + status.modified = blue bold underline red_background + status.added = green bold + status.removed = red bold blue_background + status.deleted = cyan bold underline + status.unknown = magenta bold underline + status.ignored = black bold + + # 'none' turns off all effects + status.clean = none + status.copied = none + + qseries.applied = blue bold underline + qseries.unapplied = black bold + qseries.missing = red bold + + diff.diffline = bold + diff.extended = cyan bold + diff.file_a = red bold + diff.file_b = green bold + diff.hunk = magenta + diff.deleted = red + diff.inserted = green + diff.changed = white + diff.tab = + diff.trailingwhitespace = bold red_background + + # Blank so it inherits the style of the surrounding label + changeset.public = + changeset.draft = + changeset.secret = + + resolve.unresolved = red bold + resolve.resolved = green bold + + bookmarks.active = green + + branches.active = none + branches.closed = black bold + branches.current = green + branches.inactive = none + + tags.normal = green + tags.local = black bold + + rebase.rebased = blue + rebase.remaining = red bold + + shelve.age = cyan + shelve.newest = green bold + shelve.name = blue bold + + histedit.remaining = red bold + +Custom colors +============= + +Because there are only eight standard colors, this module allows you +to define color names for other color slots which might be available +for your terminal type, assuming terminfo mode. For instance:: + + color.brightblue = 12 + color.pink = 207 + color.orange = 202 + +to set 'brightblue' to color slot 12 (useful for 16 color terminals +that have brighter colors defined in the upper eight) and, 'pink' and +'orange' to colors in 256-color xterm's default color cube. These +defined colors may then be used as any of the pre-defined eight, +including appending '_background' to set the background to that color. 
diff -r ed5b25874d99 -r 4baf79a77afa mercurial/help/config.txt --- a/mercurial/help/config.txt Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/help/config.txt Fri Mar 24 08:37:26 2017 -0700 @@ -56,6 +56,7 @@ - ``/.hg/hgrc`` (per-repository) - ``$HOME/.hgrc`` (per-user) + - ``${XDG_CONFIG_HOME:-$HOME/.config}/hg/hgrc`` (per-user) - ``/etc/mercurial/hgrc`` (per-installation) - ``/etc/mercurial/hgrc.d/*.rc`` (per-installation) - ``/etc/mercurial/hgrc`` (per-system) @@ -276,7 +277,7 @@ will let you do ``hg echo foo`` to have ``foo`` printed in your terminal. A better example might be:: - purge = !$HG status --no-status --unknown -0 re: | xargs -0 rm + purge = !$HG status --no-status --unknown -0 re: | xargs -0 rm -f which will make ``hg purge`` delete all unknown files in the repository in the same manner as the purge extension. @@ -385,6 +386,46 @@ If no suitable authentication entry is found, the user is prompted for credentials as usual if required by the remote. +``color`` +--------- + +Configure the Mercurial color mode. For details about how to define your custom +effect and style see :hg:`help color`. + +``mode`` + String: control the method used to output color. One of ``auto``, ``ansi``, + ``win32``, ``terminfo`` or ``debug``. In auto mode the color extension will + use ANSI mode by default (or win32 mode on Windows) if it detects a + terminal. Any invalid value will disable color. + +``pagermode`` + String: optional override of ``color.mode`` used with pager (from the pager + extension). + + On some systems, terminfo mode may cause problems when using + color with the pager extension and less -R. less with the -R option + will only display ECMA-48 color codes, and terminfo mode may sometimes + emit codes that less doesn't understand. You can work around this by + either using ansi mode (or auto mode), or by using less -r (which will + pass through all terminal control codes, not just color control + codes). 
+ + On some systems (such as MSYS in Windows), the terminal may support + a different color mode than the pager (activated via the "pager" + extension). + +``commands`` +------------ + +``status.relative`` + Make paths in ``hg status`` output relative to the current directory. + (default: False) + +``update.requiredest`` + Require that the user pass a destination when running ``hg update``. + For example, ``hg update .::`` will be allowed, but a plain ``hg update`` + will be disallowed. + (default: False) ``committemplate`` ------------------ @@ -700,8 +741,8 @@ Example for ``~/.hgrc``:: [extensions] - # (the color extension will get loaded from Mercurial's path) - color = + # (the churn extension will get loaded from Mercurial's path) + churn = # (this extension will get loaded from the file specified) myfeature = ~/.hgext/myfeature.py @@ -1796,6 +1837,13 @@ By default, the first bundle advertised by the server is used. +``color`` + String: when to colorize output. Possible values are auto, always, + never, or debug (default: never). 'auto' will use color whenever it seems + possible. See :hg:`help color` for details. + + (in addition a boolean can be used in place of always/never) + ``commitsubrepos`` Whether to commit modified subrepositories when committing the parent repository. 
If False and one subrepository has uncommitted diff -r ed5b25874d99 -r 4baf79a77afa mercurial/help/filesets.txt --- a/mercurial/help/filesets.txt Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/help/filesets.txt Fri Mar 24 08:37:26 2017 -0700 @@ -69,6 +69,10 @@ hg revert "set:copied() and binary() and size('>1M')" +- Revert files that were added to the working directory:: + + hg revert "set:revs('wdir()', added())" + - Remove files listed in foo.lst that contain the letter a or b:: hg remove "set: 'listfile:foo.lst' and (**a* or **b*)" diff -r ed5b25874d99 -r 4baf79a77afa mercurial/help/internals/censor.txt --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/mercurial/help/internals/censor.txt Fri Mar 24 08:37:26 2017 -0700 @@ -0,0 +1,22 @@ +The censor system allows retroactively removing content from +files. Actually censoring a node requires using the censor extension, +but the functionality for handling censored nodes is partially in core. + +Censored nodes in a filelog have the flag ``REVIDX_ISCENSORED`` set, +and the contents of the censored node are replaced with a censor +tombstone. For historical reasons, the tombstone is packed in the +filelog metadata field ``censored``. This allows censored nodes to be +(mostly) safely transmitted through old formats like changegroup +versions 1 and 2. When using changegroup formats older than 3, the +receiver is required to re-add the ``REVIDX_ISCENSORED`` flag when +storing the revision. This depends on the ``censored`` metadata key +never being used for anything other than censoring revisions, which is +true as of January 2017. Note that the revlog flag is the +authoritative marker of a censored node: the tombstone should only be +consulted when looking for a reason a node was censored or when revlog +flags are unavailable as mentioned above. + +The tombstone data is a free-form string. It's expected that users of +censor will want to record the reason for censoring a node in the +tombstone. 
Censored nodes must be able to fit in the size of the +content being censored. diff -r ed5b25874d99 -r 4baf79a77afa mercurial/help/internals/changegroups.txt --- a/mercurial/help/internals/changegroups.txt Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/help/internals/changegroups.txt Fri Mar 24 08:37:26 2017 -0700 @@ -1,35 +1,49 @@ Changegroups are representations of repository revlog data, specifically -the changelog, manifest, and filelogs. +the changelog data, root/flat manifest data, treemanifest data, and +filelogs. There are 3 versions of changegroups: ``1``, ``2``, and ``3``. From a -high-level, versions ``1`` and ``2`` are almost exactly the same, with -the only difference being a header on entries in the changeset -segment. Version ``3`` adds support for exchanging treemanifests and -includes revlog flags in the delta header. +high-level, versions ``1`` and ``2`` are almost exactly the same, with the +only difference being an additional item in the *delta header*. Version +``3`` adds support for revlog flags in the *delta header* and optionally +exchanging treemanifests (enabled by setting an option on the +``changegroup`` part in the bundle2). -Changegroups consists of 3 logical segments:: +Changegroups when not exchanging treemanifests consist of 3 logical +segments:: +---------------------------------+ | | | | | changeset | manifest | filelogs | | | | | + | | | | +---------------------------------+ +When exchanging treemanifests, there are 4 logical segments:: + + +-------------------------------------------------+ + | | | | | + | changeset | root | treemanifests | filelogs | + | | manifest | | | + | | | | | + +-------------------------------------------------+ + The principle building block of each segment is a *chunk*. 
A *chunk* is a framed piece of data:: +---------------------------------------+ | | | | length | data | - | (32 bits) | bytes | + | (4 bytes) | ( bytes) | | | | +---------------------------------------+ -Each chunk starts with a 32-bit big-endian signed integer indicating -the length of the raw data that follows. +All integers are big-endian signed integers. Each chunk starts with a 32-bit +integer indicating the length of the entire chunk (including the length field +itself). -There is a special case chunk that has 0 length (``0x00000000``). We -call this an *empty chunk*. +There is a special case chunk that has a value of 0 for the length +(``0x00000000``). We call this an *empty chunk*. Delta Groups ============ @@ -43,26 +57,27 @@ +------------------------------------------------------------------------+ | | | | | | | chunk0 length | chunk0 data | chunk1 length | chunk1 data | 0x0 | - | (32 bits) | (various) | (32 bits) | (various) | (32 bits) | + | (4 bytes) | (various) | (4 bytes) | (various) | (4 bytes) | | | | | | | - +------------------------------------------------------------+-----------+ + +------------------------------------------------------------------------+ Each *chunk*'s data consists of the following:: - +-----------------------------------------+ - | | | | - | delta header | mdiff header | delta | - | (various) | (12 bytes) | (various) | - | | | | - +-----------------------------------------+ + +---------------------------------------+ + | | | + | delta header | delta data | + | (various by version) | (various) | + | | | + +---------------------------------------+ -The *length* field is the byte length of the remaining 3 logical pieces -of data. The *delta* is a diff from an existing entry in the changelog. +The *delta data* is a series of *delta*s that describe a diff from an existing +entry (either that the recipient already has, or previously specified in the +bundle/changegroup). 
The *delta header* is different between versions ``1``, ``2``, and ``3`` of the changegroup format. -Version 1:: +Version 1 (headerlen=80):: +------------------------------------------------------+ | | | | | @@ -71,7 +86,7 @@ | | | | | +------------------------------------------------------+ -Version 2:: +Version 2 (headerlen=100):: +------------------------------------------------------------------+ | | | | | | @@ -80,30 +95,35 @@ | | | | | | +------------------------------------------------------------------+ -Version 3:: +Version 3 (headerlen=102):: +------------------------------------------------------------------------------+ | | | | | | | - | node | p1 node | p2 node | base node | link node | flags | + | node | p1 node | p2 node | base node | link node | flags | | (20 bytes) | (20 bytes) | (20 bytes) | (20 bytes) | (20 bytes) | (2 bytes) | | | | | | | | +------------------------------------------------------------------------------+ -The *mdiff header* consists of 3 32-bit big-endian signed integers -describing offsets at which to apply the following delta content:: +The *delta data* consists of ``chunklen - 4 - headerlen`` bytes, which contain a +series of *delta*s, densely packed (no separators). These deltas describe a diff +from an existing entry (either that the recipient already has, or previously +specified in the bundle/changegroup). The format is described more fully in +``hg help internals.bdiff``, but briefly:: - +-------------------------------------+ - | | | | - | offset | old length | new length | - | (32 bits) | (32 bits) | (32 bits) | - | | | | - +-------------------------------------+ + +---------------------------------------------------------------+ + | | | | | + | start offset | end offset | new length | content | + | (4 bytes) | (4 bytes) | (4 bytes) | ( bytes) | + | | | | | + +---------------------------------------------------------------+ + +Please note that the length field in the delta data does *not* include itself. 
In version 1, the delta is always applied against the previous node from the changegroup or the first parent if this is the first entry in the changegroup. -In version 2, the delta base node is encoded in the entry in the +In version 2 and up, the delta base node is encoded in the entry in the changegroup. This allows the delta to be expressed against any parent, which can result in smaller deltas and more efficient encoding of data. @@ -111,43 +131,58 @@ ================= The *changeset segment* consists of a single *delta group* holding -changelog data. It is followed by an *empty chunk* to denote the -boundary to the *manifests segment*. +changelog data. The *empty chunk* at the end of the *delta group* denotes +the boundary to the *manifest segment*. Manifest Segment ================ -The *manifest segment* consists of a single *delta group* holding -manifest data. It is followed by an *empty chunk* to denote the boundary -to the *filelogs segment*. +The *manifest segment* consists of a single *delta group* holding manifest +data. If treemanifests are in use, it contains only the manifest for the +root directory of the repository. Otherwise, it contains the entire +manifest data. The *empty chunk* at the end of the *delta group* denotes +the boundary to the next segment (either the *treemanifests segment* or the +*filelogs segment*, depending on version and the request options). + +Treemanifests Segment +--------------------- + +The *treemanifests segment* only exists in changegroup version ``3``, and +only if the 'treemanifest' param is part of the bundle2 changegroup part +(it is not possible to use changegroup version 3 outside of bundle2). +Aside from the filenames in the *treemanifests segment* containing a +trailing ``/`` character, it behaves identically to the *filelogs segment* +(see below). The final sub-segment is followed by an *empty chunk* (logically, +a sub-segment with filename size 0). This denotes the boundary to the +*filelogs segment*. 
Filelogs Segment ================ -The *filelogs* segment consists of multiple sub-segments, each +The *filelogs segment* consists of multiple sub-segments, each corresponding to an individual file whose data is being described:: - +--------------------------------------+ - | | | | | - | filelog0 | filelog1 | filelog2 | ... | - | | | | | - +--------------------------------------+ + +--------------------------------------------------+ + | | | | | | + | filelog0 | filelog1 | filelog2 | ... | 0x0 | + | | | | | (4 bytes) | + | | | | | | + +--------------------------------------------------+ -In version ``3`` of the changegroup format, filelogs may include -directory logs when treemanifests are in use. directory logs are -identified by having a trailing '/' on their filename (see below). - -The final filelog sub-segment is followed by an *empty chunk* to denote -the end of the segment and the overall changegroup. +The final filelog sub-segment is followed by an *empty chunk* (logically, +a sub-segment with filename size 0). This denotes the end of the segment +and of the overall changegroup. Each filelog sub-segment consists of the following:: - +------------------------------------------+ - | | | | - | filename size | filename | delta group | - | (32 bits) | (various) | (various) | - | | | | - +------------------------------------------+ + +------------------------------------------------------+ + | | | | + | filename length | filename | delta group | + | (4 bytes) | ( bytes) | (various) | + | | | | + +------------------------------------------------------+ That is, a *chunk* consisting of the filename (not terminated or padded) -followed by N chunks constituting the *delta group* for this file. +followed by N chunks constituting the *delta group* for this file. The +*empty chunk* at the end of each *delta group* denotes the boundary to the +next filelog sub-segment. 
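Note on the changegroups help text changed above: the *delta data* layout it describes (start offset, end offset, new length, content, densely packed) can be illustrated with a short Python sketch. This is not code from this changeset or from Mercurial. The names ``iter_deltas`` and ``apply_deltas`` are hypothetical, the big-endian signed 32-bit encoding is taken from the older wording of the help text ("3 32-bit big-endian signed integers"), and the sketch assumes the deltas are sorted by start offset, do not overlap, and refer to offsets in the base text.

```python
import struct

# Assumed header layout: start offset, end offset, new length,
# each a big-endian signed 32-bit integer (12 bytes in total).
_DELTA_HEADER = struct.Struct(">lll")


def iter_deltas(data):
    """Yield (start, end, content) for each delta packed in ``data``.

    The deltas are densely packed, so the next header starts right
    after the previous content. The length field counts only the
    content bytes, not the 12-byte header itself.
    """
    pos = 0
    while pos < len(data):
        start, end, length = _DELTA_HEADER.unpack_from(data, pos)
        pos += _DELTA_HEADER.size
        content = data[pos:pos + length]
        if len(content) != length:
            raise ValueError("truncated delta content")
        pos += length
        yield start, end, content


def apply_deltas(base, data):
    """Return ``base`` with every delta in ``data`` applied.

    Each delta replaces base[start:end] with its content. Offsets are
    assumed to refer to the base text and to be sorted, non-overlapping.
    """
    out = []
    last = 0
    for start, end, content in iter_deltas(data):
        if start < last or end < start or end > len(base):
            raise ValueError("delta offsets out of order or out of range")
        out.append(base[last:start])  # unchanged bytes before this delta
        out.append(content)           # replacement bytes
        last = end
    out.append(base[last:])           # unchanged tail
    return b"".join(out)


if __name__ == "__main__":
    base = b"hello world\n"
    # Replace bytes 6..11 ("world") with 9 bytes of new content.
    delta = struct.pack(">lll", 6, 11, 9) + b"mercurial"
    print(apply_deltas(base, delta))
```

Collecting the pieces in a list and joining once avoids repeated copying of the base text when a delta group carries many small deltas. A real consumer would also need to strip the chunk length and the version-specific delta header (80, 100, or 102 bytes) before handing the remaining bytes to a function like this.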
diff -r ed5b25874d99 -r 4baf79a77afa mercurial/help/internals/requirements.txt --- a/mercurial/help/internals/requirements.txt Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/help/internals/requirements.txt Fri Mar 24 08:37:26 2017 -0700 @@ -55,6 +55,17 @@ The requirement was added in Mercurial 1.3 (released July 2009). +relshared +========= + +Derivative of ``shared``; the location of the store is relative to the +store of this repository. + +This requirement is set when a repository is created via :hg:`share` +using the ``--relative`` option. + +The requirement was added in Mercurial 4.2 (released May 2017). + dotencode ========= diff -r ed5b25874d99 -r 4baf79a77afa mercurial/help/internals/revlogs.txt --- a/mercurial/help/internals/revlogs.txt Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/help/internals/revlogs.txt Fri Mar 24 08:37:26 2017 -0700 @@ -108,9 +108,9 @@ 16-19 (4 bytes) Base or previous revision this revision's delta was produced against. - -1 means this revision holds full text (as opposed to a delta). - For generaldelta repos, this is the previous revision in the delta - chain. For non-generaldelta repos, this is the base or first + This revision holds full text (as opposed to a delta) if it points to + itself. For generaldelta repos, this is the previous revision in the + delta chain. For non-generaldelta repos, this is the base or first revision in the delta chain. 20-23 (4 bytes) diff -r ed5b25874d99 -r 4baf79a77afa mercurial/help/pager.txt --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/mercurial/help/pager.txt Fri Mar 24 08:37:26 2017 -0700 @@ -0,0 +1,35 @@ +Some Mercurial commands produce a lot of output, and Mercurial will +attempt to use a pager to make those commands more pleasant. + +To set the pager that should be used, set the application variable:: + + [pager] + pager = less -FRX + +If no pager is set, the pager extensions uses the environment variable +$PAGER. 
If neither pager.pager, nor $PAGER is set, a default pager +will be used, typically `more`. + +You can disable the pager for certain commands by adding them to the +pager.ignore list:: + + [pager] + ignore = version, help, update + +To ignore global commands like :hg:`version` or :hg:`help`, you have +to specify them in your user configuration file. + +To control whether the pager is used at all for an individual command, +you can use --pager=:: + + - use as needed: `auto`. + - require the pager: `yes` or `on`. + - suppress the pager: `no` or `off` (any unrecognized value + will also work). + +To globally turn off all attempts to use a pager, set:: + + [pager] + enable = false + +which will prevent the pager from running. diff -r ed5b25874d99 -r 4baf79a77afa mercurial/help/patterns.txt --- a/mercurial/help/patterns.txt Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/help/patterns.txt Fri Mar 24 08:37:26 2017 -0700 @@ -13,7 +13,10 @@ To use a plain path name without any pattern matching, start it with ``path:``. These path names must completely match starting at the -current repository root. +current repository root, and when the path points to a directory, it is matched +recursively. To match all files in a directory non-recursively (not including +any files in subdirectories), ``rootfilesin:`` can be used, specifying an +absolute path (relative to the repository root). To use an extended glob, start a name with ``glob:``. Globs are rooted at the current directory; a glob such as ``*.c`` will only match files @@ -39,12 +42,15 @@ All patterns, except for ``glob:`` specified in command line (not for ``-I`` or ``-X`` options), can match also against directories: files under matched directories are treated as matched. +For ``-I`` and ``-X`` options, ``glob:`` will match directories recursively. 
Plain examples:: - path:foo/bar a name bar in a directory named foo in the root - of the repository - path:path:name a file or directory named "path:name" + path:foo/bar a name bar in a directory named foo in the root + of the repository + path:path:name a file or directory named "path:name" + rootfilesin:foo/bar the files in a directory called foo/bar, but not any files + in its subdirectories and not a file bar in directory foo Glob examples:: @@ -52,6 +58,8 @@ *.c any name ending in ".c" in the current directory **.c any name ending in ".c" in any subdirectory of the current directory including itself. + foo/* any file in directory foo plus all its subdirectories, + recursively foo/*.c any name ending in ".c" in the directory foo foo/**.c any name ending in ".c" in any subdirectory of foo including itself. diff -r ed5b25874d99 -r 4baf79a77afa mercurial/hg.py --- a/mercurial/hg.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/hg.py Fri Mar 24 08:37:26 2017 -0700 @@ -40,6 +40,7 @@ url, util, verify as verifymod, + vfs as vfsmod, ) release = lock.release @@ -195,7 +196,8 @@ return '' return os.path.basename(os.path.normpath(path)) -def share(ui, source, dest=None, update=True, bookmarks=True, defaultpath=None): +def share(ui, source, dest=None, update=True, bookmarks=True, defaultpath=None, + relative=False): '''create a shared repository''' if not islocal(source): @@ -218,8 +220,8 @@ sharedpath = srcrepo.sharedpath # if our source is already sharing - destwvfs = scmutil.vfs(dest, realpath=True) - destvfs = scmutil.vfs(os.path.join(destwvfs.base, '.hg'), realpath=True) + destwvfs = vfsmod.vfs(dest, realpath=True) + destvfs = vfsmod.vfs(os.path.join(destwvfs.base, '.hg'), realpath=True) if destvfs.lexists(): raise error.Abort(_('destination already exists')) @@ -235,7 +237,16 @@ if inst.errno != errno.ENOENT: raise - requirements += 'shared\n' + if relative: + try: + sharedpath = os.path.relpath(sharedpath, destvfs.base) + requirements += 'relshared\n' + except 
IOError as e: + raise error.Abort(_('cannot calculate relative path'), + hint=str(e)) + else: + requirements += 'shared\n' + destvfs.write('requires', requirements) destvfs.write('sharedpath', sharedpath) @@ -302,8 +313,8 @@ else: ui.progress(topic, pos + num) srcpublishing = srcrepo.publishing() - srcvfs = scmutil.vfs(srcrepo.sharedpath) - dstvfs = scmutil.vfs(destpath) + srcvfs = vfsmod.vfs(srcrepo.sharedpath) + dstvfs = vfsmod.vfs(destpath) for f in srcrepo.store.copylist(): if srcpublishing and f.endswith('phaseroots'): continue @@ -359,7 +370,7 @@ if e.errno != errno.EEXIST: raise - poolvfs = scmutil.vfs(pooldir) + poolvfs = vfsmod.vfs(pooldir) basename = os.path.basename(sharepath) with lock.lock(poolvfs, '%s.lock' % basename): @@ -464,7 +475,7 @@ if not dest: raise error.Abort(_("empty destination path is not valid")) - destvfs = scmutil.vfs(dest, expandpath=True) + destvfs = vfsmod.vfs(dest, expandpath=True) if destvfs.lexists(): if not destvfs.isdir(): raise error.Abort(_("destination '%s' already exists") % dest) @@ -548,7 +559,7 @@ destlock = copystore(ui, srcrepo, destpath) # copy bookmarks over - srcbookmarks = srcrepo.join('bookmarks') + srcbookmarks = srcrepo.vfs.join('bookmarks') dstbookmarks = os.path.join(destpath, 'bookmarks') if os.path.exists(srcbookmarks): util.copyfile(srcbookmarks, dstbookmarks) @@ -556,7 +567,7 @@ # Recomputing branch cache might be slow on big repos, # so just copy it def copybranchcache(fname): - srcbranchcache = srcrepo.join('cache/%s' % fname) + srcbranchcache = srcrepo.vfs.join('cache/%s' % fname) dstbranchcache = os.path.join(dstcachedir, fname) if os.path.exists(srcbranchcache): if not os.path.exists(dstcachedir): @@ -602,14 +613,10 @@ else: stream = None # internal config: ui.quietbookmarkmove - quiet = local.ui.backupconfig('ui', 'quietbookmarkmove') - try: - local.ui.setconfig( - 'ui', 'quietbookmarkmove', True, 'clone') + overrides = {('ui', 'quietbookmarkmove'): True} + with local.ui.configoverride(overrides, 
'clone'): exchange.pull(local, srcpeer, revs, streamclonerequested=stream) - finally: - local.ui.restoreconfig(quiet) elif srcrepo: exchange.push(srcrepo, destpeer, revs=revs, bookmarks=srcrepo._bookmarks.keys()) @@ -681,18 +688,19 @@ repo.ui.status(_("%d files updated, %d files merged, " "%d files removed, %d files unresolved\n") % stats) -def updaterepo(repo, node, overwrite): +def updaterepo(repo, node, overwrite, updatecheck=None): """Update the working directory to node. When overwrite is set, changes are clobbered, merged else returns stats (see pydoc mercurial.merge.applyupdates)""" return mergemod.update(repo, node, False, overwrite, - labels=['working copy', 'destination']) + labels=['working copy', 'destination'], + updatecheck=updatecheck) -def update(repo, node, quietempty=False): - """update the working directory to node, merging linear changes""" - stats = updaterepo(repo, node, False) +def update(repo, node, quietempty=False, updatecheck=None): + """update the working directory to node""" + stats = updaterepo(repo, node, False, updatecheck=updatecheck) _showstats(repo, stats, quietempty) if stats[3]: repo.ui.status(_("use 'hg resolve' to retry unresolved file merges\n")) @@ -704,7 +712,7 @@ def clean(repo, node, show_stats=True, quietempty=False): """forcibly switch the working directory to node, clobbering changes""" stats = updaterepo(repo, node, True) - util.unlinkpath(repo.join('graftstate'), ignoremissing=True) + repo.vfs.unlinkpath('graftstate', ignoremissing=True) if show_stats: _showstats(repo, stats, quietempty) return stats[3] > 0 @@ -712,7 +720,7 @@ # naming conflict in updatetotally() _clean = clean -def updatetotally(ui, repo, checkout, brev, clean=False, check=False): +def updatetotally(ui, repo, checkout, brev, clean=False, updatecheck=None): """Update the working directory with extra care for non-file components This takes care of non-file components below: @@ -724,22 +732,38 @@ :checkout: to which revision the working directory is 
updated :brev: a name, which might be a bookmark to be activated after updating :clean: whether changes in the working directory can be discarded - :check: whether changes in the working directory should be checked + :updatecheck: how to deal with a dirty working directory + + Valid values for updatecheck are (None => linear): + + * abort: abort if the working directory is dirty + * none: don't check (merge working directory changes into destination) + * linear: check that update is linear before merging working directory + changes into destination + * noconflict: check that the update does not result in file merges This returns whether conflict is detected at updating or not. """ + if updatecheck is None: + updatecheck = ui.config('experimental', 'updatecheck') + if updatecheck not in ('abort', 'none', 'linear', 'noconflict'): + # If not configured, or invalid value configured + updatecheck = 'linear' with repo.wlock(): movemarkfrom = None warndest = False if checkout is None: - updata = destutil.destupdate(repo, clean=clean, check=check) + updata = destutil.destupdate(repo, clean=clean) checkout, movemarkfrom, brev = updata warndest = True if clean: ret = _clean(repo, checkout) else: - ret = _update(repo, checkout) + if updatecheck == 'abort': + cmdutil.bailifchanged(repo, merge=False) + updatecheck = 'none' + ret = _update(repo, checkout, updatecheck=updatecheck) if not ret and movemarkfrom: if movemarkfrom == repo['.'].node(): @@ -802,7 +826,7 @@ if not chlist: ui.status(_("no changes found\n")) return subreporecurse() - + ui.pager('incoming') displayer = cmdutil.show_changeset(ui, other, opts, buffered) displaychlist(other, chlist, displayer) displayer.close() @@ -870,6 +894,7 @@ if opts.get('newest_first'): o.reverse() + ui.pager('outgoing') displayer = cmdutil.show_changeset(ui, repo, opts) count = 0 for n in o: diff -r ed5b25874d99 -r 4baf79a77afa mercurial/hgweb/common.py --- a/mercurial/hgweb/common.py Thu Mar 23 19:54:59 2017 -0700 +++ 
b/mercurial/hgweb/common.py Fri Mar 24 08:37:26 2017 -0700 @@ -91,11 +91,13 @@ class ErrorResponse(Exception): - def __init__(self, code, message=None, headers=[]): + def __init__(self, code, message=None, headers=None): if message is None: message = _statusmessage(code) Exception.__init__(self, message) self.code = code + if headers is None: + headers = [] self.headers = headers class continuereader(object): diff -r ed5b25874d99 -r 4baf79a77afa mercurial/hgweb/hgwebdir_mod.py --- a/mercurial/hgweb/hgwebdir_mod.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/hgweb/hgwebdir_mod.py Fri Mar 24 08:37:26 2017 -0700 @@ -254,13 +254,21 @@ return [] # top-level index - elif not virtual: + + repos = dict(self.repos) + + if not virtual or (virtual == 'index' and virtual not in repos): req.respond(HTTP_OK, ctype) return self.makeindex(req, tmpl) # nested indexes and hgwebs - repos = dict(self.repos) + if virtual.endswith('/index') and virtual not in repos: + subdir = virtual[:-len('index')] + if any(r.startswith(subdir) for r in repos): + req.respond(HTTP_OK, ctype) + return self.makeindex(req, tmpl, subdir) + virtualrepo = virtual while virtualrepo: real = repos.get(virtualrepo) @@ -352,8 +360,7 @@ pass parts = [name] - if 'PATH_INFO' in req.env: - parts.insert(0, req.env['PATH_INFO'].rstrip('/')) + parts.insert(0, '/' + subdir.rstrip('/')) if req.env['SCRIPT_NAME']: parts.insert(0, req.env['SCRIPT_NAME']) url = re.sub(r'/+', '/', '/'.join(parts) + '/') diff -r ed5b25874d99 -r 4baf79a77afa mercurial/hgweb/webcommands.py --- a/mercurial/hgweb/webcommands.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/hgweb/webcommands.py Fri Mar 24 08:37:26 2017 -0700 @@ -32,7 +32,9 @@ error, graphmod, revset, + revsetlang, scmutil, + smartset, templatefilters, templater, util, @@ -238,20 +240,20 @@ revdef = 'reverse(%s)' % query try: - tree = revset.parse(revdef) + tree = revsetlang.parse(revdef) except error.ParseError: # can't parse to a revset tree return MODE_KEYWORD, query - if 
revset.depth(tree) <= 2: + if revsetlang.depth(tree) <= 2: # no revset syntax used return MODE_KEYWORD, query if any((token, (value or '')[:3]) == ('string', 're:') - for token, value, pos in revset.tokenize(revdef)): + for token, value, pos in revsetlang.tokenize(revdef)): return MODE_KEYWORD, query - funcsused = revset.funcsused(tree) + funcsused = revsetlang.funcsused(tree) if not funcsused.issubset(revset.safesymbols): return MODE_KEYWORD, query @@ -752,13 +754,14 @@ if fctx is not None: path = fctx.path() ctx = fctx.changectx() + basectx = ctx.p1() parity = paritygen(web.stripecount) style = web.config('web', 'style', 'paper') if 'style' in req.form: style = req.form['style'][0] - diffs = webutil.diffs(web.repo, tmpl, ctx, None, [path], parity, style) + diffs = webutil.diffs(web.repo, tmpl, ctx, basectx, [path], parity, style) if fctx is not None: rename = webutil.renamelink(fctx) ctx = fctx @@ -1148,7 +1151,7 @@ # We have to feed a baseset to dagwalker as it is expecting smartset # object. This does not have a big impact on hgweb performance itself # since hgweb graphing code is not itself lazy yet. - dag = graphmod.dagwalker(web.repo, revset.baseset(revs)) + dag = graphmod.dagwalker(web.repo, smartset.baseset(revs)) # As we said one line above... not lazy. 
tree = list(graphmod.colored(dag, web.repo)) diff -r ed5b25874d99 -r 4baf79a77afa mercurial/hgweb/webutil.py --- a/mercurial/hgweb/webutil.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/hgweb/webutil.py Fri Mar 24 08:37:26 2017 -0700 @@ -72,6 +72,8 @@ """return True if any revision to navigate over""" return self._first() is not None + __bool__ = __nonzero__ + def _first(self): """return the minimum non-filtered changeset or None""" try: @@ -142,7 +144,9 @@ return hex(self._changelog.node(self._revlog.linkrev(rev))) class _siblings(object): - def __init__(self, siblings=[], hiderev=None): + def __init__(self, siblings=None, hiderev=None): + if siblings is None: + siblings = [] self.siblings = [s for s in siblings if s.node() != nullid] if len(self.siblings) == 1 and self.siblings[0].rev() == hiderev: self.siblings = [] @@ -412,16 +416,9 @@ def diffs(repo, tmpl, ctx, basectx, files, parity, style): - def countgen(): - start = 1 - while True: - yield start - start += 1 - - blockcount = countgen() - def prettyprintlines(diff, blockno): - for lineno, l in enumerate(diff.splitlines(True)): - difflineno = "%d.%d" % (blockno, lineno + 1) + def prettyprintlines(lines, blockno): + for lineno, l in enumerate(lines, 1): + difflineno = "%d.%d" % (blockno, lineno) if l.startswith('+'): ltype = "difflineplus" elif l.startswith('-'): @@ -432,7 +429,7 @@ ltype = "diffline" yield tmpl(ltype, line=l, - lineno=lineno + 1, + lineno=lineno, lineid="l%s" % difflineno, linenumber="% 8s" % difflineno) @@ -442,29 +439,19 @@ m = match.always(repo.root, repo.getcwd()) diffopts = patch.diffopts(repo.ui, untrusted=True) - if basectx is None: - parents = ctx.parents() - if parents: - node1 = parents[0].node() - else: - node1 = nullid - else: - node1 = basectx.node() + node1 = basectx.node() node2 = ctx.node() - block = [] - for chunk in patch.diff(repo, node1, node2, m, opts=diffopts): - if chunk.startswith('diff') and block: - blockno = next(blockcount) + diffhunks = patch.diffhunks(repo, 
node1, node2, m, opts=diffopts) + for blockno, (header, hunks) in enumerate(diffhunks, 1): + if style != 'raw': + header = header[1:] + lines = [h + '\n' for h in header] + for hunkrange, hunklines in hunks: + lines.extend(hunklines) + if lines: yield tmpl('diffblock', parity=next(parity), blockno=blockno, - lines=prettyprintlines(''.join(block), blockno)) - block = [] - if chunk.startswith('diff') and style != 'raw': - chunk = ''.join(chunk.splitlines(True)[1:]) - block.append(chunk) - blockno = next(blockcount) - yield tmpl('diffblock', parity=next(parity), blockno=blockno, - lines=prettyprintlines(''.join(block), blockno)) + lines=prettyprintlines(lines, blockno)) def compare(tmpl, context, leftlines, rightlines): '''Generator function that provides side-by-side comparison data.''' diff -r ed5b25874d99 -r 4baf79a77afa mercurial/hook.py --- a/mercurial/hook.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/hook.py Fri Mar 24 08:37:26 2017 -0700 @@ -9,7 +9,6 @@ import os import sys -import time from .i18n import _ from . 
import ( @@ -88,7 +87,7 @@ % (hname, funcname)) ui.note(_("calling hook %s: %s\n") % (hname, funcname)) - starttime = time.time() + starttime = util.timer() try: r = obj(ui=ui, repo=repo, hooktype=name, **args) @@ -106,7 +105,7 @@ ui.traceback() return True, True finally: - duration = time.time() - starttime + duration = util.timer() - starttime ui.log('pythonhook', 'pythonhook-%s: %s finished in %0.2f seconds\n', name, funcname, duration) if r: @@ -118,7 +117,7 @@ def _exthook(ui, repo, name, cmd, args, throw): ui.note(_("running hook %s: %s\n") % (name, cmd)) - starttime = time.time() + starttime = util.timer() env = {} # make in-memory changes visible to external process @@ -143,9 +142,9 @@ cwd = repo.root else: cwd = pycompat.getcwd() - r = ui.system(cmd, environ=env, cwd=cwd) + r = ui.system(cmd, environ=env, cwd=cwd, blockedtag='exthook-%s' % (name,)) - duration = time.time() - starttime + duration = util.timer() - starttime ui.log('exthook', 'exthook-%s: %s finished in %0.2f seconds\n', name, cmd, duration) if r: diff -r ed5b25874d99 -r 4baf79a77afa mercurial/httpclient/__init__.py --- a/mercurial/httpclient/__init__.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/httpclient/__init__.py Fri Mar 24 08:37:26 2017 -0700 @@ -631,7 +631,7 @@ self.close() self._connect(pheaders) - def request(self, method, path, body=None, headers={}, + def request(self, method, path, body=None, headers=None, expect_continue=False): """Send a request to the server. @@ -642,6 +642,8 @@ available. Use the `getresponse()` method to retrieve the response. 
""" + if headers is None: + headers = {} method = _ensurebytes(method) path = _ensurebytes(path) if self.busy(): diff -r ed5b25874d99 -r 4baf79a77afa mercurial/httpconnection.py --- a/mercurial/httpconnection.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/httpconnection.py Fri Mar 24 08:37:26 2017 -0700 @@ -44,10 +44,10 @@ self._total = self.length // 1024 * 2 def read(self, *args, **kwargs): - try: - ret = self._data.read(*args, **kwargs) - except EOFError: + ret = self._data.read(*args, **kwargs) + if not ret: self.ui.progress(_('sending'), None) + return ret self._pos += len(ret) # We pass double the max for total because we currently have # to send the bundle twice in the case of a server that @@ -67,13 +67,13 @@ # moved here from url.py to avoid a cycle def readauthforuri(ui, uri, user): # Read configuration - config = dict() + groups = {} for key, val in ui.configitems('auth'): if '.' not in key: ui.warn(_("ignoring invalid [auth] key '%s'\n") % key) continue group, setting = key.rsplit('.', 1) - gdict = config.setdefault(group, dict()) + gdict = groups.setdefault(group, {}) if setting in ('username', 'cert', 'key'): val = util.expandpath(val) gdict[setting] = val @@ -83,7 +83,7 @@ bestuser = None bestlen = 0 bestauth = None - for group, auth in config.iteritems(): + for group, auth in groups.iteritems(): if user and user != auth.get('username', user): # If a username was set in the URI, the entry username # must either match it or be unset diff -r ed5b25874d99 -r 4baf79a77afa mercurial/httppeer.py --- a/mercurial/httppeer.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/httppeer.py Fri Mar 24 08:37:26 2017 -0700 @@ -20,6 +20,7 @@ bundle2, error, httpconnection, + pycompat, statichttprepo, url, util, @@ -327,7 +328,7 @@ try: # dump bundle to disk fd, filename = tempfile.mkstemp(prefix="hg-bundle-", suffix=".hg") - fh = os.fdopen(fd, "wb") + fh = os.fdopen(fd, pycompat.sysstr("wb")) d = fp.read(4096) while d: fh.write(d) diff -r ed5b25874d99 -r 
4baf79a77afa mercurial/i18n.py --- a/mercurial/i18n.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/i18n.py Fri Mar 24 08:37:26 2017 -0700 @@ -21,7 +21,7 @@ if getattr(sys, 'frozen', None) is not None: module = pycompat.sysexecutable else: - module = __file__ + module = pycompat.fsencode(__file__) try: unicode diff -r ed5b25874d99 -r 4baf79a77afa mercurial/keepalive.py --- a/mercurial/keepalive.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/keepalive.py Fri Mar 24 08:37:26 2017 -0700 @@ -310,14 +310,16 @@ try: if req.has_data(): data = req.get_data() - h.putrequest('POST', req.get_selector(), **skipheaders) + h.putrequest( + req.get_method(), req.get_selector(), **skipheaders) if 'content-type' not in headers: h.putheader('Content-type', 'application/x-www-form-urlencoded') if 'content-length' not in headers: h.putheader('Content-length', '%d' % len(data)) else: - h.putrequest('GET', req.get_selector(), **skipheaders) + h.putrequest( + req.get_method(), req.get_selector(), **skipheaders) except socket.error as err: raise urlerr.urlerror(err) for k, v in headers.items(): diff -r ed5b25874d99 -r 4baf79a77afa mercurial/localrepo.py --- a/mercurial/localrepo.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/localrepo.py Fri Mar 24 08:37:26 2017 -0700 @@ -28,6 +28,7 @@ bundle2, changegroup, changelog, + color, context, dirstate, dirstateguard, @@ -48,14 +49,18 @@ peer, phases, pushkey, + pycompat, repoview, revset, + revsetlang, scmutil, store, subrepo, tags as tagsmod, transaction, + txnutil, util, + vfs as vfsmod, ) release = lockmod.release @@ -66,6 +71,8 @@ """All filecache usage on repo are done for logic that should be unfiltered """ + def join(self, obj, fname): + return obj.vfs.join(fname) def __get__(self, repo, type=None): if repo is None: return self @@ -113,7 +120,9 @@ class localpeer(peer.peerrepository): '''peer for a local repo; reflects only the most recent API''' - def __init__(self, repo, caps=moderncaps): + def __init__(self, repo, caps=None): + 
if caps is None: + caps = moderncaps.copy() peer.peerrepository.__init__(self) self._repo = repo.filtered('served') self.ui = repo.ui @@ -241,7 +250,7 @@ supportedformats = set(('revlogv1', 'generaldelta', 'treemanifest', 'manifestv2')) _basesupported = supportedformats | set(('store', 'fncache', 'shared', - 'dotencode')) + 'relshared', 'dotencode')) openerreqs = set(('revlogv1', 'generaldelta', 'treemanifest', 'manifestv2')) filtername = None @@ -251,16 +260,21 @@ def __init__(self, baseui, path, create=False): self.requirements = set() - self.wvfs = scmutil.vfs(path, expandpath=True, realpath=True) - self.wopener = self.wvfs + # wvfs: rooted at the repository root, used to access the working copy + self.wvfs = vfsmod.vfs(path, expandpath=True, realpath=True) + # vfs: rooted at .hg, used to access repo files outside of .hg/store + self.vfs = None + # svfs: usually rooted at .hg/store, used to access repository history + # If this is a shared repository, this vfs may point to another + # repository's .hg/store directory. + self.svfs = None self.root = self.wvfs.base self.path = self.wvfs.join(".hg") self.origroot = path self.auditor = pathutil.pathauditor(self.root, self._checknested) self.nofsauditor = pathutil.pathauditor(self.root, self._checknested, realfs=False) - self.vfs = scmutil.vfs(self.path) - self.opener = self.vfs + self.vfs = vfsmod.vfs(self.path) self.baseui = baseui self.ui = baseui.copy() self.ui.copy = baseui.copy # prevent copying repo configuration @@ -269,8 +283,8 @@ # This list it to be filled by extension during repo setup self._phasedefaults = [] try: - self.ui.readconfig(self.join("hgrc"), self.root) - extensions.loadall(self.ui) + self.ui.readconfig(self.vfs.join("hgrc"), self.root) + self._loadextensions() except IOError: pass @@ -283,6 +297,7 @@ setupfunc(self.ui, self.supported) else: self.supported = self._basesupported + color.setup(self.ui) # Add compression engines. 
for name in util.compengines: @@ -321,8 +336,10 @@ self.sharedpath = self.path try: - vfs = scmutil.vfs(self.vfs.read("sharedpath").rstrip('\n'), - realpath=True) + sharedpath = self.vfs.read("sharedpath").rstrip('\n') + if 'relshared' in self.requirements: + sharedpath = self.vfs.join(sharedpath) + vfs = vfsmod.vfs(sharedpath, realpath=True) s = vfs.base if not vfs.exists(): raise error.RepoError( @@ -333,7 +350,7 @@ raise self.store = store.store( - self.requirements, self.sharedpath, scmutil.vfs) + self.requirements, self.sharedpath, vfsmod.vfs) self.spath = self.store.path self.svfs = self.store.vfs self.sjoin = self.store.join @@ -368,9 +385,22 @@ # generic mapping between names and nodes self.names = namespaces.namespaces() + @property + def wopener(self): + self.ui.deprecwarn("use 'repo.wvfs' instead of 'repo.wopener'", '4.2') + return self.wvfs + + @property + def opener(self): + self.ui.deprecwarn("use 'repo.vfs' instead of 'repo.opener'", '4.2') + return self.vfs + def close(self): self._writecaches() + def _loadextensions(self): + extensions.loadall(self.ui) + def _writecaches(self): if self._revbranchcache: self._revbranchcache.write() @@ -461,9 +491,9 @@ """Return a filtered version of a repository""" # build a new class with the mixin and the current class # (possibly subclass of the repo) - class proxycls(repoview.repoview, self.unfiltered().__class__): + class filteredrepo(repoview.repoview, self.unfiltered().__class__): pass - return proxycls(self, name) + return filteredrepo(self, name) @repofilecache('bookmarks', 'bookmarks.current') def _bookmarks(self): @@ -509,10 +539,8 @@ @storecache('00changelog.i') def changelog(self): c = changelog.changelog(self.svfs) - if 'HG_PENDING' in encoding.environ: - p = encoding.environ['HG_PENDING'] - if p.startswith(self.root): - c.readpending('00changelog.i.a') + if txnutil.mayhavepending(self.root): + c.readpending('00changelog.i.a') return c def _constructmanifest(self): @@ -560,6 +588,8 @@ def 
__nonzero__(self): return True + __bool__ = __nonzero__ + def __len__(self): return len(self.changelog) @@ -570,15 +600,16 @@ '''Find revisions matching a revset. The revset is specified as a string ``expr`` that may contain - %-formatting to escape certain types. See ``revset.formatspec``. + %-formatting to escape certain types. See ``revsetlang.formatspec``. Revset aliases from the configuration are not expanded. To expand - user aliases, consider calling ``scmutil.revrange()``. + user aliases, consider calling ``scmutil.revrange()`` or + ``repo.anyrevs([expr], user=True)``. Returns a revset.abstractsmartset, which is a list-like interface that contains integer revisions. ''' - expr = revset.formatspec(expr, *args) + expr = revsetlang.formatspec(expr, *args) m = revset.match(None, expr) return m(self) @@ -594,6 +625,18 @@ for r in self.revs(expr, *args): yield self[r] + def anyrevs(self, specs, user=False): + '''Find revisions matching one of the given revsets. + + Revset aliases from the configuration are not expanded by default. To + expand user aliases, specify ``user=True``. 
+ ''' + if user: + m = revset.matchany(self.ui, specs, repo=self) + else: + m = revset.matchany(None, specs) + return m(self) + def url(self): return 'file:' + self.root @@ -653,11 +696,11 @@ return try: - fp = self.wfile('.hgtags', 'rb+') + fp = self.wvfs('.hgtags', 'rb+') except IOError as e: if e.errno != errno.ENOENT: raise - fp = self.wfile('.hgtags', 'ab') + fp = self.wvfs('.hgtags', 'ab') else: prevtags = fp.read() @@ -896,6 +939,7 @@ return None def join(self, f, *insidef): + self.ui.deprecwarn("use 'repo.vfs.join' instead of 'repo.join'", '4.0') return self.vfs.join(os.path.join(f, *insidef)) def wjoin(self, f, *insidef): @@ -938,9 +982,12 @@ return self.dirstate.pathto(f, cwd) def wfile(self, f, mode='r'): + self.ui.deprecwarn("use 'repo.wvfs' instead of 'repo.wfile'", '4.2') return self.wvfs(f, mode) def _link(self, f): + self.ui.deprecwarn("use 'repo.wvfs.islink' instead of 'repo._link'", + '4.0') return self.wvfs.islink(f) def _loadfilter(self, filter): @@ -988,7 +1035,7 @@ self._datafilters[name] = filter def wread(self, filename): - if self._link(filename): + if self.wvfs.islink(filename): data = self.wvfs.readlink(filename) else: data = self.wvfs.read(filename) @@ -1038,7 +1085,8 @@ hint=_("run 'hg recover' to clean up transaction")) idbase = "%.40f#%f" % (random.random(), time.time()) - txnid = 'TXN:' + hashlib.sha1(idbase).hexdigest() + ha = hex(hashlib.sha1(idbase).digest()) + txnid = 'TXN:' + ha self.hook('pretxnopen', throw=True, txnname=desc, txnid=txnid) self._writejournal(desc) @@ -1053,7 +1101,7 @@ def validate(tr): """will run pre-closing hooks""" reporef().hook('pretxnclose', throw=True, - txnname=desc, **tr.hookargs) + txnname=desc, **pycompat.strkwargs(tr.hookargs)) def releasefn(tr, success): repo = reporef() if success: @@ -1094,7 +1142,7 @@ def hook(): reporef().hook('txnclose', throw=False, txnname=desc, - **hookargs) + **pycompat.strkwargs(hookargs)) reporef()._afterlock(hook) tr.addfinalize('txnclose-hook', txnclosehook) def 
txnaborthook(tr2): @@ -1270,7 +1318,7 @@ redundant one doesn't). ''' unfiltered = self.unfiltered() # all file caches are stored unfiltered - for k in self._filecache.keys(): + for k in list(self._filecache.keys()): # dirstate is invalidated separately in invalidatedirstate() if k == 'dirstate': continue @@ -1852,6 +1900,11 @@ listsubrepos) def heads(self, start=None): + if start is None: + cl = self.changelog + headrevs = reversed(cl.headrevs()) + return [cl.node(rev) for rev in headrevs] + heads = self.changelog.heads(start) # sort the output in rev descending order return sorted(heads, key=self.changelog.rev, reverse=True) @@ -1972,6 +2025,10 @@ renamefiles = [tuple(t) for t in files] def a(): for vfs, src, dest in renamefiles: + # if src and dest refer to a same file, vfs.rename is a no-op, + # leaving both src and dest on disk. delete dest to make sure + # the rename couldn't be such a no-op. + vfs.tryunlink(dest) try: vfs.rename(src, dest) except OSError: # journal file does not yet exist diff -r ed5b25874d99 -r 4baf79a77afa mercurial/lock.py --- a/mercurial/lock.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/lock.py Fri Mar 24 08:37:26 2017 -0700 @@ -9,15 +9,36 @@ import contextlib import errno +import os import socket import time import warnings from . import ( + encoding, error, + pycompat, util, ) +def _getlockprefix(): + """Return a string which is used to differentiate pid namespaces + + It's useful to detect "dead" processes and remove stale locks with + confidence. Typically it's just hostname. On modern linux, we include an + extra Linux-specific pid namespace identifier. 
+ """ + result = socket.gethostname() + if pycompat.ispy3: + result = result.encode(pycompat.sysstr(encoding.encoding), 'replace') + if pycompat.sysplatform.startswith('linux'): + try: + result += '/%x' % os.stat('/proc/self/ns/pid').st_ino + except OSError as ex: + if ex.errno not in (errno.ENOENT, errno.EACCES, errno.ENOTDIR): + raise + return result + class lock(object): '''An advisory lock held by one process to control access to a set of files. Non-cooperating processes or incorrectly written scripts @@ -99,8 +120,8 @@ self.held += 1 return if lock._host is None: - lock._host = socket.gethostname() - lockname = '%s:%s' % (lock._host, self.pid) + lock._host = _getlockprefix() + lockname = '%s:%d' % (lock._host, self.pid) retry = 5 while not self.held and retry: retry -= 1 diff -r ed5b25874d99 -r 4baf79a77afa mercurial/mail.py --- a/mercurial/mail.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/mail.py Fri Mar 24 08:37:26 2017 -0700 @@ -353,4 +353,4 @@ except UnicodeDecodeError: pass uparts.append(part.decode('ISO-8859-1')) - return encoding.tolocal(u' '.join(uparts).encode('UTF-8')) + return encoding.unitolocal(u' '.join(uparts)) diff -r ed5b25874d99 -r 4baf79a77afa mercurial/manifest.py --- a/mercurial/manifest.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/manifest.py Fri Mar 24 08:37:26 2017 -0700 @@ -7,12 +7,15 @@ from __future__ import absolute_import -import array import heapq import os import struct from .i18n import _ +from .node import ( + bin, + hex, +) from . 
import ( error, mdiff, @@ -38,9 +41,9 @@ prev = l f, n = l.split('\0') if len(n) > 40: - yield f, revlog.bin(n[:40]), n[40:] + yield f, bin(n[:40]), n[40:] else: - yield f, revlog.bin(n), '' + yield f, bin(n), '' def _parsev2(data): metadataend = data.find('\n') @@ -124,6 +127,8 @@ zeropos = data.find('\x00', pos) return data[pos:zeropos] + __next__ = next + class lazymanifestiterentries(object): def __init__(self, lm): self.lm = lm @@ -147,8 +152,10 @@ self.pos += 1 return (data[pos:zeropos], hashval, flags) + __next__ = next + def unhexlify(data, extra, pos, length): - s = data[pos:pos + length].decode('hex') + s = bin(data[pos:pos + length]) if extra: s += chr(extra & 0xff) return s @@ -173,7 +180,7 @@ if not data: return [] pos = data.find("\n") - if pos == -1 or data[-1] != '\n': + if pos == -1 or data[-1:] != '\n': raise ValueError("Manifest did not end in a newline.") positions = [0] prev = data[:data.find('\x00')] @@ -251,8 +258,8 @@ return self.data[start:end] def __getitem__(self, key): - if not isinstance(key, str): - raise TypeError("getitem: manifest keys must be a string.") + if not isinstance(key, bytes): + raise TypeError("getitem: manifest keys must be a bytes.") needle = self.bsearch(key) if needle == -1: raise KeyError @@ -277,17 +284,17 @@ self.data = self.data[:cur] + '\x00' + self.data[cur + 1:] def __setitem__(self, key, value): - if not isinstance(key, str): - raise TypeError("setitem: manifest keys must be a string.") + if not isinstance(key, bytes): + raise TypeError("setitem: manifest keys must be a byte string.") if not isinstance(value, tuple) or len(value) != 2: raise TypeError("Manifest values must be a tuple of (node, flags).") hashval = value[0] - if not isinstance(hashval, str) or not 20 <= len(hashval) <= 22: - raise TypeError("node must be a 20-byte string") + if not isinstance(hashval, bytes) or not 20 <= len(hashval) <= 22: + raise TypeError("node must be a 20-byte byte string") flags = value[1] if len(hashval) == 22: hashval = 
hashval[:-1] - if not isinstance(flags, str) or len(flags) > 1: + if not isinstance(flags, bytes) or len(flags) > 1: raise TypeError("flags must a 0 or 1 byte string, got %r", flags) needle, found = self.bsearch2(key) if found: @@ -351,7 +358,7 @@ self.extradata = [] def _pack(self, d): - return d[0] + '\x00' + d[1][:20].encode('hex') + d[2] + '\n' + return d[0] + '\x00' + hex(d[1][:20]) + d[2] + '\n' def text(self): self._compact() @@ -427,6 +434,8 @@ # makes it easier for extensions to override. return len(self._lm) != 0 + __bool__ = __nonzero__ + def __setitem__(self, key, node): self._lm[key] = node, self.flags(key, '') @@ -445,8 +454,12 @@ def keys(self): return list(self.iterkeys()) - def filesnotin(self, m2): + def filesnotin(self, m2, match=None): '''Set of files in this manifest that are not in the other''' + if match: + m1 = self.matches(match) + m2 = m2.matches(match) + return m1.filesnotin(m2) diff = self.diff(m2) files = set(filepath for filepath, hashflags in diff.iteritems() @@ -523,7 +536,7 @@ m._lm = self._lm.filtercopy(match) return m - def diff(self, m2, clean=False): + def diff(self, m2, match=None, clean=False): '''Finds changes between the current manifest and m2. Args: @@ -538,6 +551,10 @@ the nodeid will be None and the flags will be the empty string. ''' + if match: + m1 = self.matches(match) + m2 = m2.matches(match) + return m1.diff(m2, clean=clean) return self._lm.diff(m2._lm, clean) def setflag(self, key, flag): @@ -620,8 +637,9 @@ else: # For large changes, it's much cheaper to just build the text and # diff it. 
- arraytext = array.array('c', self.text()) - deltatext = mdiff.textdiff(base, arraytext) + arraytext = bytearray(self.text()) + deltatext = mdiff.textdiff( + util.buffer(base), util.buffer(arraytext)) return arraytext, deltatext @@ -679,12 +697,12 @@ # for large addlist arrays, building a new array is cheaper # than repeatedly modifying the existing one currentposition = 0 - newaddlist = array.array('c') + newaddlist = bytearray() for start, end, content in x: newaddlist += addlist[currentposition:start] if content: - newaddlist += array.array('c', content) + newaddlist += bytearray(content) currentposition = end @@ -906,8 +924,13 @@ copy._copyfunc = self._copyfunc return copy - def filesnotin(self, m2): + def filesnotin(self, m2, match=None): '''Set of files in this manifest that are not in the other''' + if match: + m1 = self.matches(match) + m2 = m2.matches(match) + return m1.filesnotin(m2) + files = set() def _filesnotin(t1, t2): if t1._node == t2._node and not t1._dirty and not t2._dirty: @@ -1025,7 +1048,7 @@ ret._dirty = True return ret - def diff(self, m2, clean=False): + def diff(self, m2, match=None, clean=False): '''Finds changes between the current manifest and m2. Args: @@ -1040,6 +1063,10 @@ the nodeid will be None and the flags will be the empty string. ''' + if match: + m1 = self.matches(match) + m2 = m2.matches(match) + return m1.diff(m2, clean=clean) result = {} emptytree = treemanifest() def _diff(t1, t2): @@ -1132,7 +1159,12 @@ '''A revlog that stores manifest texts. This is responsible for caching the full-text manifest contents. ''' - def __init__(self, opener, dir='', dirlogcache=None): + def __init__(self, opener, dir='', dirlogcache=None, indexfile=None): + """Constructs a new manifest revlog + + `indexfile` - used by extensions to have two manifests at once, like + when transitioning between flatmanifest and treemanifests. 
+ """ # During normal operations, we expect to deal with not more than four # revs at a time (such as during commit --amend). When rebasing large # stacks of commits, the number can go up, hence the config knob below. @@ -1150,12 +1182,16 @@ self._fulltextcache = util.lrucachedict(cachesize) - indexfile = "00manifest.i" if dir: assert self._treeondisk, 'opts is %r' % opts if not dir.endswith('/'): dir = dir + '/' - indexfile = "meta/" + dir + "00manifest.i" + + if indexfile is None: + indexfile = '00manifest.i' + if dir: + indexfile = "meta/" + dir + indexfile + self._dir = dir # The dirlogcache is kept on the root manifest log if dir: @@ -1214,7 +1250,7 @@ else: text = m.text(self._usemanifestv2) n = self.addrevision(text, transaction, link, p1, p2) - arraytext = array.array('c', text) + arraytext = bytearray(text) if arraytext is not None: self.fulltextcache[n] = arraytext @@ -1224,7 +1260,7 @@ def _addtree(self, m, transaction, link, m1, m2, readtree): # If the manifest is unchanged compared to one parent, # don't write a new revision - if m.unmodifiedsince(m1) or m.unmodifiedsince(m2): + if self._dir != '' and (m.unmodifiedsince(m1) or m.unmodifiedsince(m2)): return m.node() def writesubtree(subm, subp1, subp2): sublog = self.dirlog(subm.dir()) @@ -1232,13 +1268,17 @@ readtree=readtree) m.writesubtrees(m1, m2, writesubtree) text = m.dirtext(self._usemanifestv2) - # Double-check whether contents are unchanged to one parent - if text == m1.dirtext(self._usemanifestv2): - n = m1.node() - elif text == m2.dirtext(self._usemanifestv2): - n = m2.node() - else: + n = None + if self._dir != '': + # Double-check whether contents are unchanged to one parent + if text == m1.dirtext(self._usemanifestv2): + n = m1.node() + elif text == m2.dirtext(self._usemanifestv2): + n = m2.node() + + if not n: n = self.addrevision(text, transaction, link, m1.node(), m2.node()) + # Save nodeid so parent manifest can calculate its nodeid m.setnode(n) return n @@ -1252,8 +1292,6 @@ class do 
not care about the implementation details of the actual manifests they receive (i.e. tree or flat or lazily loaded, etc).""" def __init__(self, opener, repo): - self._repo = repo - usetreemanifest = False cachesize = 4 @@ -1300,7 +1338,7 @@ if node not in dirlog.nodemap: raise LookupError(node, dirlog.indexfile, _('no node')) - m = treemanifestctx(self._repo, dir, node) + m = treemanifestctx(self, dir, node) else: raise error.Abort( _("cannot ask for manifest directory '%s' in a flat " @@ -1311,9 +1349,9 @@ raise LookupError(node, self._revlog.indexfile, _('no node')) if self._treeinmem: - m = treemanifestctx(self._repo, '', node) + m = treemanifestctx(self, '', node) else: - m = manifestctx(self._repo, node) + m = manifestctx(self, node) if node != revlog.nullid: mancache = self._dirmancache.get(dir) @@ -1328,18 +1366,18 @@ self._revlog.clearcaches() class memmanifestctx(object): - def __init__(self, repo): - self._repo = repo + def __init__(self, manifestlog): + self._manifestlog = manifestlog self._manifestdict = manifestdict() def _revlog(self): - return self._repo.manifestlog._revlog + return self._manifestlog._revlog def new(self): - return memmanifestctx(self._repo) + return memmanifestctx(self._manifestlog) def copy(self): - memmf = memmanifestctx(self._repo) + memmf = memmanifestctx(self._manifestlog) memmf._manifestdict = self.read().copy() return memmf @@ -1354,8 +1392,8 @@ """A class representing a single revision of a manifest, including its contents, its parent revs, and its linkrev. 
""" - def __init__(self, repo, node): - self._repo = repo + def __init__(self, manifestlog, node): + self._manifestlog = manifestlog self._data = None self._node = node @@ -1368,16 +1406,16 @@ #self.linkrev = revlog.linkrev(rev) def _revlog(self): - return self._repo.manifestlog._revlog + return self._manifestlog._revlog def node(self): return self._node def new(self): - return memmanifestctx(self._repo) + return memmanifestctx(self._manifestlog) def copy(self): - memmf = memmanifestctx(self._repo) + memmf = memmanifestctx(self._manifestlog) memmf._manifestdict = self.read().copy() return memmf @@ -1386,13 +1424,13 @@ return self._revlog().parents(self._node) def read(self): - if not self._data: + if self._data is None: if self._node == revlog.nullid: self._data = manifestdict() else: rl = self._revlog() text = rl.revision(self._node) - arraytext = array.array('c', text) + arraytext = bytearray(text) rl._fulltextcache[self._node] = arraytext self._data = manifestdict(text) return self._data @@ -1422,7 +1460,7 @@ if revlog._usemanifestv2: # Need to perform a slow delta r0 = revlog.deltaparent(revlog.rev(self._node)) - m0 = self._repo.manifestlog[revlog.node(r0)].read() + m0 = self._manifestlog[revlog.node(r0)].read() m1 = self.read() md = manifestdict() for f, ((n0, fl0), (n1, fl1)) in m0.diff(m1).iteritems(): @@ -1440,19 +1478,19 @@ return self.read().find(key) class memtreemanifestctx(object): - def __init__(self, repo, dir=''): - self._repo = repo + def __init__(self, manifestlog, dir=''): + self._manifestlog = manifestlog self._dir = dir self._treemanifest = treemanifest() def _revlog(self): - return self._repo.manifestlog._revlog + return self._manifestlog._revlog def new(self, dir=''): - return memtreemanifestctx(self._repo, dir=dir) + return memtreemanifestctx(self._manifestlog, dir=dir) def copy(self): - memmf = memtreemanifestctx(self._repo, dir=self._dir) + memmf = memtreemanifestctx(self._manifestlog, dir=self._dir) memmf._treemanifest = 
self._treemanifest.copy() return memmf @@ -1461,13 +1499,13 @@ def write(self, transaction, link, p1, p2, added, removed): def readtree(dir, node): - return self._repo.manifestlog.get(dir, node).read() + return self._manifestlog.get(dir, node).read() return self._revlog().add(self._treemanifest, transaction, link, p1, p2, added, removed, readtree=readtree) class treemanifestctx(object): - def __init__(self, repo, dir, node): - self._repo = repo + def __init__(self, manifestlog, dir, node): + self._manifestlog = manifestlog self._dir = dir self._data = None @@ -1481,10 +1519,10 @@ #self.linkrev = revlog.linkrev(rev) def _revlog(self): - return self._repo.manifestlog._revlog.dirlog(self._dir) + return self._manifestlog._revlog.dirlog(self._dir) def read(self): - if not self._data: + if self._data is None: rl = self._revlog() if self._node == revlog.nullid: self._data = treemanifest() @@ -1495,14 +1533,13 @@ def readsubtree(dir, subm): # Set verify to False since we need to be able to create # subtrees for trees that don't exist on disk. 
- return self._repo.manifestlog.get(dir, subm, - verify=False).read() + return self._manifestlog.get(dir, subm, verify=False).read() m.read(gettext, readsubtree) m.setnode(self._node) self._data = m else: text = rl.revision(self._node) - arraytext = array.array('c', text) + arraytext = bytearray(text) rl.fulltextcache[self._node] = arraytext self._data = treemanifest(dir=self._dir, text=text) @@ -1512,10 +1549,10 @@ return self._node def new(self, dir=''): - return memtreemanifestctx(self._repo, dir=dir) + return memtreemanifestctx(self._manifestlog, dir=dir) def copy(self): - memmf = memtreemanifestctx(self._repo, dir=self._dir) + memmf = memtreemanifestctx(self._manifestlog, dir=self._dir) memmf._treemanifest = self.read().copy() return memmf @@ -1542,7 +1579,7 @@ else: # Need to perform a slow delta r0 = revlog.deltaparent(revlog.rev(self._node)) - m0 = self._repo.manifestlog.get(self._dir, revlog.node(r0)).read() + m0 = self._manifestlog.get(self._dir, revlog.node(r0)).read() m1 = self.read() md = treemanifest(dir=self._dir) for f, ((n0, fl0), (n1, fl1)) in m0.diff(m1).iteritems(): diff -r ed5b25874d99 -r 4baf79a77afa mercurial/match.py --- a/mercurial/match.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/match.py Fri Mar 24 08:37:26 2017 -0700 @@ -85,7 +85,7 @@ return True class match(object): - def __init__(self, root, cwd, patterns, include=[], exclude=[], + def __init__(self, root, cwd, patterns, include=None, exclude=None, default='glob', exact=False, auditor=None, ctx=None, listsubrepos=False, warn=None, badfn=None): """build an object to match a set of file patterns @@ -104,7 +104,10 @@ a pattern is one of: 'glob:' - a glob relative to cwd 're:' - a regular expression - 'path:' - a path relative to repository root + 'path:' - a path relative to repository root, which is matched + recursively + 'rootfilesin:' - a path relative to repository root, which is + matched non-recursively (will not match subdirectories) 'relglob:' - an unrooted glob (*.c 
matches C files in all dirs) 'relpath:' - a path relative to cwd 'relre:' - a regexp that needn't match the start of a name @@ -114,6 +117,10 @@ the same directory '' - a pattern of the specified default type """ + if include is None: + include = [] + if exclude is None: + exclude = [] self._root = root self._cwd = cwd @@ -122,9 +129,12 @@ self._always = False self._pathrestricted = bool(include or exclude or patterns) self._warn = warn + + # roots are directories which are recursively included/excluded. self._includeroots = set() + self._excluderoots = set() + # dirs are directories which are non-recursively included. self._includedirs = set(['.']) - self._excluderoots = set() if badfn is not None: self.bad = badfn @@ -134,14 +144,20 @@ kindpats = self._normalize(include, 'glob', root, cwd, auditor) self.includepat, im = _buildmatch(ctx, kindpats, '(?:/|$)', listsubrepos, root) - self._includeroots.update(_roots(kindpats)) - self._includedirs.update(util.dirs(self._includeroots)) + roots, dirs = _rootsanddirs(kindpats) + self._includeroots.update(roots) + self._includedirs.update(dirs) matchfns.append(im) if exclude: kindpats = self._normalize(exclude, 'glob', root, cwd, auditor) self.excludepat, em = _buildmatch(ctx, kindpats, '(?:/|$)', listsubrepos, root) if not _anypats(kindpats): + # Only consider recursive excludes as such - if a non-recursive + # exclude is used, we must still recurse into the excluded + # directory, at least to find subdirectories. In such a case, + # the regex still won't match the non-recursively-excluded + # files. 
self._excluderoots.update(_roots(kindpats)) matchfns.append(lambda f: not em(f)) if exact: @@ -153,7 +169,7 @@ elif patterns: kindpats = self._normalize(patterns, default, root, cwd, auditor) if not _kindpatsalwaysmatch(kindpats): - self._files = _roots(kindpats) + self._files = _explicitfiles(kindpats) self._anypats = self._anypats or _anypats(kindpats) self.patternspat, pm = _buildmatch(ctx, kindpats, '$', listsubrepos, root) @@ -238,7 +254,7 @@ return 'all' if dir in self._excluderoots: return False - if (self._includeroots and + if ((self._includeroots or self._includedirs != set(['.'])) and '.' not in self._includeroots and dir not in self._includeroots and dir not in self._includedirs and @@ -286,7 +302,7 @@ for kind, pat in [_patsplit(p, default) for p in patterns]: if kind in ('glob', 'relpath'): pat = pathutil.canonpath(root, cwd, pat, auditor) - elif kind in ('relglob', 'path'): + elif kind in ('relglob', 'path', 'rootfilesin'): pat = util.normpath(pat) elif kind in ('listfile', 'listfile0'): try: @@ -419,7 +435,9 @@ # m.exact(file) must be based off of the actual user input, otherwise # inexact case matches are treated as exact, and not noted without -v. 
if self._files: - self._fileroots = set(_roots(self._kp)) + roots, dirs = _rootsanddirs(self._kp) + self._fileroots = set(roots) + self._fileroots.update(dirs) def _normalize(self, patterns, default, root, cwd, auditor): self._kp = super(icasefsmatcher, self)._normalize(patterns, default, @@ -447,7 +465,8 @@ if ':' in pattern: kind, pat = pattern.split(':', 1) if kind in ('re', 'glob', 'path', 'relglob', 'relpath', 'relre', - 'listfile', 'listfile0', 'set', 'include', 'subinclude'): + 'listfile', 'listfile0', 'set', 'include', 'subinclude', + 'rootfilesin'): return kind, pat return default, pattern @@ -476,9 +495,9 @@ group = 0 escape = util.re.escape def peek(): - return i < n and pat[i] + return i < n and pat[i:i + 1] while i < n: - c = pat[i] + c = pat[i:i + 1] i += 1 if c not in '*?[{},\\': res += escape(c) @@ -496,18 +515,18 @@ res += '.' elif c == '[': j = i - if j < n and pat[j] in '!]': + if j < n and pat[j:j + 1] in '!]': j += 1 - while j < n and pat[j] != ']': + while j < n and pat[j:j + 1] != ']': j += 1 if j >= n: res += '\\[' else: stuff = pat[i:j].replace('\\','\\\\') i = j + 1 - if stuff[0] == '!': + if stuff[0:1] == '!': stuff = '^' + stuff[1:] - elif stuff[0] == '^': + elif stuff[0:1] == '^': stuff = '\\' + stuff res = '%s[%s]' % (res, stuff) elif c == '{': @@ -540,6 +559,14 @@ if pat == '.': return '' return '^' + util.re.escape(pat) + '(?:/|$)' + if kind == 'rootfilesin': + if pat == '.': + escaped = '' + else: + # Pattern is a directory name. + escaped = util.re.escape(pat) + '/' + # Anything after the pattern must be a non-directory. 
+ return '^' + escaped + '[^/]+$' if kind == 'relglob': return '(?:|.*/)' + _globre(pat) + globsuffix if kind == 'relpath': @@ -609,17 +636,16 @@ raise error.Abort(_("invalid pattern (%s): %s") % (k, p)) raise error.Abort(_("invalid pattern")) -def _roots(kindpats): - '''return roots and exact explicitly listed files from patterns +def _patternrootsanddirs(kindpats): + '''Returns roots and directories corresponding to each pattern. - >>> _roots([('glob', 'g/*', ''), ('glob', 'g', ''), ('glob', 'g*', '')]) - ['g', 'g', '.'] - >>> _roots([('relpath', 'r', ''), ('path', 'p/p', ''), ('path', '', '')]) - ['r', 'p/p', '.'] - >>> _roots([('relglob', 'rg*', ''), ('re', 're/', ''), ('relre', 'rr', '')]) - ['.', '.', '.'] + This calculates the roots and directories exactly matching the patterns and + returns a tuple of (roots, dirs) for each. It does not return other + directories which may also need to be considered, like the parent + directories. ''' r = [] + d = [] for kind, pat, source in kindpats: if kind == 'glob': # find the non-glob prefix root = [] @@ -630,13 +656,63 @@ r.append('/'.join(root) or '.') elif kind in ('relpath', 'path'): r.append(pat or '.') + elif kind in ('rootfilesin',): + d.append(pat or '.') else: # relglob, re, relre r.append('.') - return r + return r, d + +def _roots(kindpats): + '''Returns root directories to match recursively from the given patterns.''' + roots, dirs = _patternrootsanddirs(kindpats) + return roots + +def _rootsanddirs(kindpats): + '''Returns roots and exact directories from patterns. + + roots are directories to match recursively, whereas exact directories should + be matched non-recursively. The returned (roots, dirs) tuple will also + include directories that need to be implicitly considered as either, such as + parent directories. 
+ + >>> _rootsanddirs(\ + [('glob', 'g/h/*', ''), ('glob', 'g/h', ''), ('glob', 'g*', '')]) + (['g/h', 'g/h', '.'], ['g']) + >>> _rootsanddirs(\ + [('rootfilesin', 'g/h', ''), ('rootfilesin', '', '')]) + ([], ['g/h', '.', 'g']) + >>> _rootsanddirs(\ + [('relpath', 'r', ''), ('path', 'p/p', ''), ('path', '', '')]) + (['r', 'p/p', '.'], ['p']) + >>> _rootsanddirs(\ + [('relglob', 'rg*', ''), ('re', 're/', ''), ('relre', 'rr', '')]) + (['.', '.', '.'], []) + ''' + r, d = _patternrootsanddirs(kindpats) + + # Append the parents as non-recursive/exact directories, since they must be + # scanned to get to either the roots or the other exact directories. + d.extend(util.dirs(d)) + d.extend(util.dirs(r)) + + return r, d + +def _explicitfiles(kindpats): + '''Returns the potential explicit filenames from the patterns. + + >>> _explicitfiles([('path', 'foo/bar', '')]) + ['foo/bar'] + >>> _explicitfiles([('rootfilesin', 'foo/bar', '')]) + [] + ''' + # Keep only the pattern kinds where one can specify filenames (vs only + # directory names). 
+ filable = [kp for kp in kindpats if kp[0] not in ('rootfilesin',)] + return _roots(filable) def _anypats(kindpats): for kind, pat, source in kindpats: - if kind in ('glob', 're', 'relglob', 'relre', 'set'): + if kind in ('glob', 're', 'relglob', 'relre', 'set', 'rootfilesin'): return True _commentre = None @@ -668,12 +744,12 @@ syntax = 'relre:' patterns = [] - fp = open(filepath) + fp = open(filepath, 'rb') for lineno, line in enumerate(util.iterfile(fp), start=1): if "#" in line: global _commentre if not _commentre: - _commentre = util.re.compile(r'((?:^|[^\\])(?:\\\\)*)#.*') + _commentre = util.re.compile(br'((?:^|[^\\])(?:\\\\)*)#.*') # remove comments prefixed by an even number of escapes m = _commentre.search(line) if m: diff -r ed5b25874d99 -r 4baf79a77afa mercurial/mdiff.py --- a/mercurial/mdiff.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/mdiff.py Fri Mar 24 08:37:26 2017 -0700 @@ -196,15 +196,23 @@ yield s1, '=' def unidiff(a, ad, b, bd, fn1, fn2, opts=defaultopts): + """Return a unified diff as a (headers, hunks) tuple. + + If the diff is not null, `headers` is a list with unified diff header + lines "--- " and "+++ " and `hunks` is a generator yielding + (hunkrange, hunklines) coming from _unidiff(). + Otherwise, `headers` and `hunks` are empty. 
+ """ def datetag(date, fn=None): if not opts.git and not opts.nodates: - return '\t%s\n' % date + return '\t%s' % date if fn and ' ' in fn: - return '\t\n' - return '\n' + return '\t' + return '' + sentinel = [], () if not a and not b: - return "" + return sentinel if opts.noprefix: aprefix = bprefix = '' @@ -217,10 +225,17 @@ fn1 = util.pconvert(fn1) fn2 = util.pconvert(fn2) + def checknonewline(lines): + for text in lines: + if text[-1] != '\n': + text += "\n\ No newline at end of file\n" + yield text + if not opts.text and (util.binary(a) or util.binary(b)): if a and b and len(a) == len(b) and a == b: - return "" - l = ['Binary file %s has changed\n' % fn1] + return sentinel + headerlines = [] + hunks = (None, ['Binary file %s has changed\n' % fn1]), elif not a: b = splitnewlines(b) if a is None: @@ -228,8 +243,11 @@ else: l1 = "--- %s%s%s" % (aprefix, fn1, datetag(ad, fn1)) l2 = "+++ %s%s" % (bprefix + fn2, datetag(bd, fn2)) - l3 = "@@ -0,0 +1,%d @@\n" % len(b) - l = [l1, l2, l3] + ["+" + e for e in b] + headerlines = [l1, l2] + size = len(b) + hunkrange = (0, 0, 1, size) + hunklines = ["@@ -0,0 +1,%d @@\n" % size] + ["+" + e for e in b] + hunks = (hunkrange, checknonewline(hunklines)), elif not b: a = splitnewlines(a) l1 = "--- %s%s%s" % (aprefix, fn1, datetag(ad, fn1)) @@ -237,28 +255,42 @@ l2 = '+++ /dev/null%s' % datetag(epoch) else: l2 = "+++ %s%s%s" % (bprefix, fn2, datetag(bd, fn2)) - l3 = "@@ -1,%d +0,0 @@\n" % len(a) - l = [l1, l2, l3] + ["-" + e for e in a] + headerlines = [l1, l2] + size = len(a) + hunkrange = (1, size, 0, 0) + hunklines = ["@@ -1,%d +0,0 @@\n" % size] + ["-" + e for e in a] + hunks = (hunkrange, checknonewline(hunklines)), else: - al = splitnewlines(a) - bl = splitnewlines(b) - l = list(_unidiff(a, b, al, bl, opts=opts)) - if not l: - return "" + diffhunks = _unidiff(a, b, opts=opts) + try: + hunkrange, hunklines = next(diffhunks) + except StopIteration: + return sentinel - l.insert(0, "--- %s%s%s" % (aprefix, fn1, datetag(ad, 
fn1))) - l.insert(1, "+++ %s%s%s" % (bprefix, fn2, datetag(bd, fn2))) + headerlines = [ + "--- %s%s%s" % (aprefix, fn1, datetag(ad, fn1)), + "+++ %s%s%s" % (bprefix, fn2, datetag(bd, fn2)), + ] + def rewindhunks(): + yield hunkrange, checknonewline(hunklines) + for hr, hl in diffhunks: + yield hr, checknonewline(hl) - for ln in xrange(len(l)): - if l[ln][-1] != '\n': - l[ln] += "\n\ No newline at end of file\n" + hunks = rewindhunks() - return "".join(l) + return headerlines, hunks + +def _unidiff(t1, t2, opts=defaultopts): + """Yield hunks of a headerless unified diff from t1 and t2 texts. -# creates a headerless unified diff -# t1 and t2 are the text to be diffed -# l1 and l2 are the text broken up into lines -def _unidiff(t1, t2, l1, l2, opts=defaultopts): + Each hunk consists of a (hunkrange, hunklines) tuple where `hunkrange` is a + tuple (s1, l1, s2, l2) representing the range information of the hunk to + form the '@@ -s1,l1 +s2,l2 @@' header and `hunklines` is a list of lines + of the hunk combining said header followed by line additions and + deletions. + """ + l1 = splitnewlines(t1) + l2 = splitnewlines(t2) def contextend(l, len): ret = l + opts.context if ret > len: @@ -300,12 +332,13 @@ if blen: bstart += 1 - yield "@@ -%d,%d +%d,%d @@%s\n" % (astart, alen, - bstart, blen, func) - for x in delta: - yield x - for x in xrange(a2, aend): - yield ' ' + l1[x] + hunkrange = astart, alen, bstart, blen + hunklines = ( + ["@@ -%d,%d +%d,%d @@%s\n" % (hunkrange + (func,))] + + delta + + [' ' + l1[x] for x in xrange(a2, aend)] + ) + yield hunkrange, hunklines # bdiff.blocks gives us the matching sequences in the files. 
The loop # below finds the spaces between those matching sequences and translates diff -r ed5b25874d99 -r 4baf79a77afa mercurial/merge.py --- a/mercurial/merge.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/merge.py Fri Mar 24 08:37:26 2017 -0700 @@ -27,6 +27,7 @@ copies, error, filemerge, + match as matchmod, obsolete, pycompat, scmutil, @@ -123,7 +124,7 @@ self._mdstate = 's' else: self._mdstate = 'u' - shutil.rmtree(self._repo.join('merge'), True) + shutil.rmtree(self._repo.vfs.join('merge'), True) self._results = {} self._dirty = False @@ -818,11 +819,10 @@ if any(wctx.sub(s).dirty() for s in wctx.substate): m1['.hgsubstate'] = modifiednodeid - # Compare manifests - if matcher is not None: - m1 = m1.matches(matcher) - m2 = m2.matches(matcher) - diff = m1.diff(m2) + diff = m1.diff(m2, match=matcher) + + if matcher is None: + matcher = matchmod.always('', '') actions = {} for f, ((n1, fl1), (n2, fl2)) in diff.iteritems(): @@ -927,7 +927,7 @@ # new file added in a directory that was moved df = dirmove[d] + f[len(d):] break - if df in m1: + if df is not None and df in m1: actions[df] = ('m', (df, f, f, False, pa.node()), "local directory rename - respect move from " + f) elif acceptremote: @@ -1060,8 +1060,7 @@ yields tuples for progress updates """ verbose = repo.ui.verbose - unlink = util.unlinkpath - wjoin = repo.wjoin + unlinkpath = repo.wvfs.unlinkpath audit = repo.wvfs.audit try: cwd = pycompat.getcwd() @@ -1076,7 +1075,7 @@ repo.ui.note(_("removing %s\n") % f) audit(f) try: - unlink(wjoin(f), ignoremissing=True) + unlinkpath(f, ignoremissing=True) except OSError as inst: repo.ui.warn(_("update failed to remove %s: %s!\n") % (f, inst.strerror)) @@ -1190,7 +1189,7 @@ if os.path.lexists(repo.wjoin(f)): repo.ui.debug("removing %s\n" % f) audit(f) - util.unlinkpath(repo.wjoin(f)) + repo.wvfs.unlinkpath(f) numupdates = sum(len(l) for m, l in actions.items() if m != 'k') @@ -1247,7 +1246,7 @@ repo.ui.note(_("moving %s to %s\n") % (f0, f)) audit(f) repo.wwrite(f, 
wctx.filectx(f0).data(), flags) - util.unlinkpath(repo.wjoin(f0)) + repo.wvfs.unlinkpath(f0) updated += 1 # local directory rename, get @@ -1444,11 +1443,12 @@ repo.dirstate.normal(f) def update(repo, node, branchmerge, force, ancestor=None, - mergeancestor=False, labels=None, matcher=None, mergeforce=False): + mergeancestor=False, labels=None, matcher=None, mergeforce=False, + updatecheck=None): """ Perform a merge between the working directory and the given node - node = the node to update to, or None if unspecified + node = the node to update to branchmerge = whether to merge between branches force = whether to force branch merging or file overwriting matcher = a matcher to filter file lists (dirstate not updated) @@ -1464,34 +1464,47 @@ The table below shows all the behaviors of the update command given the -c and -C or no options, whether the working directory is dirty, whether a revision is specified, and the relationship of - the parent rev to the target rev (linear, on the same named - branch, or on another named branch). + the parent rev to the target rev (linear or not). Match from top first. The + -n option doesn't exist on the command line, but represents the + experimental.updatecheck=noconflict option. This logic is tested by test-update-branches.t. 
- -c -C dirty rev | linear same cross - n n n n | ok (1) x - n n n y | ok ok ok - n n y n | merge (2) (2) - n n y y | merge (3) (3) - n y * * | discard discard discard - y n y * | (4) (4) (4) - y n n * | ok ok ok - y y * * | (5) (5) (5) + -c -C -n -m dirty rev linear | result + y y * * * * * | (1) + y * y * * * * | (1) + y * * y * * * | (1) + * y y * * * * | (1) + * y * y * * * | (1) + * * y y * * * | (1) + * * * * * n n | x + * * * * n * * | ok + n n n n y * y | merge + n n n n y y n | (2) + n n n y y * * | merge + n n y n y * * | merge if no conflict + n y n n y * * | discard + y n n n y * * | (3) x = can't happen * = don't-care - 1 = abort: not a linear update (merge or update --check to force update) - 2 = abort: uncommitted changes (commit and merge, or update --clean to - discard changes) - 3 = abort: uncommitted changes (commit or update --clean to discard changes) - 4 = abort: uncommitted changes (checked in commands.py) - 5 = incompatible options (checked in commands.py) + 1 = incompatible options (checked in commands.py) + 2 = abort: uncommitted changes (commit or update --clean to discard changes) + 3 = abort: uncommitted changes (checked in commands.py) Return the same tuple as applyupdates(). """ - onode = node + # This function used to find the default destination if node was None, but + # that's now in destutil.py. + assert node is not None + if not branchmerge and not force: + # TODO: remove the default once all callers that pass branchmerge=False + # and force=False pass a value for updatecheck. We may want to allow + # updatecheck='abort' to better support some of these callers. + if updatecheck is None: + updatecheck = 'linear' + assert updatecheck in ('none', 'linear', 'noconflict') # If we're doing a partial update, we need to skip updating # the dirstate, so make a note of any partial-ness to the # update here.
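The defaulting rule that this hunk adds to update() can be sketched in isolation. The helper below is hypothetical (resolve_updatecheck is not a Mercurial function) and only mirrors the added lines: when neither branchmerge nor force is set, a missing updatecheck falls back to 'linear', and only 'none', 'linear' and 'noconflict' are accepted. Raising ValueError where the patch uses assert is an assumption made for illustration.

```python
# Hypothetical helper mirroring the updatecheck defaulting added to update().
# Not part of Mercurial; raising ValueError instead of assert is an assumption.
VALID_UPDATECHECKS = ('none', 'linear', 'noconflict')


def resolve_updatecheck(branchmerge, force, updatecheck=None):
    """Return the effective updatecheck mode for a plain (non-merge) update."""
    if not branchmerge and not force:
        # Callers that do not pass a value get the historical behaviour.
        if updatecheck is None:
            updatecheck = 'linear'
        if updatecheck not in VALID_UPDATECHECKS:
            raise ValueError('unknown updatecheck: %r' % (updatecheck,))
    # For merges and forced updates the value is passed through untouched.
    return updatecheck
```

With this sketch, a plain update with no explicit mode resolves to 'linear', while a merge (branchmerge=True) leaves the value as None, matching the patch, which only validates the mode on the non-merge, non-forced path.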
@@ -1531,7 +1544,7 @@ raise error.Abort(_("merging with a working directory ancestor" " has no effect")) elif pas == [p1]: - if not mergeancestor and p1.branch() == p2.branch(): + if not mergeancestor and wc.branch() == p2.branch(): raise error.Abort(_("nothing to merge"), hint=_("use 'hg update' " "or check 'hg heads'")) @@ -1548,39 +1561,33 @@ repo.hook('update', parent1=xp2, parent2='', error=0) return 0, 0, 0, 0 - if pas not in ([p1], [p2]): # nonlinear + if (updatecheck == 'linear' and + pas not in ([p1], [p2])): # nonlinear dirty = wc.dirty(missing=True) - if dirty or onode is None: + if dirty: # Branching is a bit strange to ensure we do the minimal - # amount of call to obsolete.background. + # amount of call to obsolete.foreground. foreground = obsolete.foreground(repo, [p1.node()]) # note: the variable contains a random identifier if repo[node].node() in foreground: - pas = [p1] # allow updating to successors - elif dirty: + pass # allow updating to successors + else: msg = _("uncommitted changes") - if onode is None: - hint = _("commit and merge, or update --clean to" - " discard changes") - else: - hint = _("commit or update --clean to discard" - " changes") - raise error.Abort(msg, hint=hint) - else: # node is none - msg = _("not a linear update") - hint = _("merge or update --check to force update") - raise error.Abort(msg, hint=hint) + hint = _("commit or update --clean to discard changes") + raise error.UpdateAbort(msg, hint=hint) else: # Allow jumping branches if clean and specific rev given - pas = [p1] + pass + + if overwrite: + pas = [wc] + elif not branchmerge: + pas = [p1] # deprecated config: merge.followcopies followcopies = repo.ui.configbool('merge', 'followcopies', True) if overwrite: - pas = [wc] followcopies = False - elif pas == [p2]: # backwards - pas = [p1] elif not pas[0]: followcopies = False if not branchmerge and not wc.dirty(missing=True): @@ -1591,6 +1598,13 @@ repo, wc, p2, pas, branchmerge, force, mergeancestor, followcopies, 
matcher=matcher, mergeforce=mergeforce) + if updatecheck == 'noconflict': + for f, (m, args, msg) in actionbyfile.iteritems(): + if m not in ('g', 'k', 'e', 'r'): + msg = _("conflicting changes") + hint = _("commit or update --clean to discard changes") + raise error.Abort(msg, hint=hint) + # Prompt and create actions. Most of this is in the resolve phase # already, but we can't handle .hgsubstate in filemerge or # subrepo.submerge yet so we have to keep prompting for it. @@ -1664,7 +1678,7 @@ repo.setparents(fp1, fp2) recordupdates(repo, actions, branchmerge) # update completed, clear state - util.unlink(repo.join('updatestate')) + util.unlink(repo.vfs.join('updatestate')) if not branchmerge: repo.dirstate.setbranch(p2.branch()) diff -r ed5b25874d99 -r 4baf79a77afa mercurial/minirst.py --- a/mercurial/minirst.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/minirst.py Fri Mar 24 08:37:26 2017 -0700 @@ -26,6 +26,7 @@ from .i18n import _ from . import ( encoding, + pycompat, util, ) @@ -59,12 +60,12 @@ # ASCII characters other than control/alphabet/digit as a part of # multi-bytes characters, so direct replacing with such characters # on strings in local encoding causes invalid byte sequences. - utext = text.decode(encoding.encoding) + utext = text.decode(pycompat.sysstr(encoding.encoding)) for f, t in substs: utext = utext.replace(f.decode("ascii"), t.decode("ascii")) - return utext.encode(encoding.encoding) + return utext.encode(pycompat.sysstr(encoding.encoding)) -_blockre = re.compile(r"\n(?:\s*\n)+") +_blockre = re.compile(br"\n(?:\s*\n)+") def findblocks(text): """Find continuous blocks of lines in text. @@ -138,12 +139,12 @@ i += 1 return blocks -_bulletre = re.compile(r'(-|[0-9A-Za-z]+\.|\(?[0-9A-Za-z]+\)|\|) ') -_optionre = re.compile(r'^(-([a-zA-Z0-9]), )?(--[a-z0-9-]+)' - r'((.*) +)(.*)$') -_fieldre = re.compile(r':(?![: ])([^:]*)(?%s\n
%s\n' % (term, text)) elif btype == 'bullet': bullet, head = lines[0].split(' ', 1) - if bullet == '-': + if bullet in ('*', '-'): openlist('ul', level) else: openlist('ol', level) @@ -629,7 +645,7 @@ return ''.join(out) -def parse(text, indent=0, keep=None): +def parse(text, indent=0, keep=None, admonitions=None): """Parse text into a list of blocks""" pruned = [] blocks = findblocks(text) @@ -644,7 +660,7 @@ blocks = splitparagraphs(blocks) blocks = updatefieldlists(blocks) blocks = updateoptionlists(blocks) - blocks = findadmonitions(blocks) + blocks = findadmonitions(blocks, admonitions=admonitions) blocks = addmargins(blocks) blocks = prunecomments(blocks) return blocks, pruned diff -r ed5b25874d99 -r 4baf79a77afa mercurial/obsolete.py --- a/mercurial/obsolete.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/obsolete.py Fri Mar 24 08:37:26 2017 -0700 @@ -552,6 +552,8 @@ pass return bool(self._all) + __bool__ = __nonzero__ + @property def readonly(self): """True if marker creation is disabled @@ -1120,7 +1122,7 @@ """the set of obsolete revisions""" obs = set() getnode = repo.changelog.node - notpublic = repo.revs("not public()") + notpublic = repo._phasecache.getrevset(repo, (phases.draft, phases.secret)) for r in notpublic: if getnode(r) in repo.obsstore.successors: obs.add(r) diff -r ed5b25874d99 -r 4baf79a77afa mercurial/osutil.c --- a/mercurial/osutil.c Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/osutil.c Fri Mar 24 08:37:26 2017 -0700 @@ -24,6 +24,16 @@ #include #include #include +#ifdef HAVE_LINUX_MAGIC_H +#include +#endif +#ifdef HAVE_BSD_STATFS +#include +#include +#endif +#ifdef HAVE_SYS_VFS_H +#include +#endif #endif #ifdef __APPLE__ @@ -206,7 +216,7 @@ char *pattern; /* build the path + \* pattern string */ - pattern = malloc(plen + 3); /* path + \* + \0 */ + pattern = PyMem_Malloc(plen + 3); /* path + \* + \0 */ if (!pattern) { PyErr_NoMemory(); goto error_nomem; @@ -269,7 +279,7 @@ error_list: FindClose(fh); error_file: - free(pattern); + 
PyMem_Free(pattern); error_nomem: return rval; } @@ -400,6 +410,8 @@ Py_XDECREF(stat); error_list: closedir(dir); + /* closedir also closes its dirfd */ + goto error_value; error_dir: #ifdef AT_SYMLINK_NOFOLLOW close(dfd); @@ -784,6 +796,312 @@ } #endif /* ndef SETPROCNAME_USE_NONE */ +#ifdef HAVE_STATFS +/* given a directory path, return filesystem type (best-effort), or None */ +const char *getfstype(const char *path) { +#ifdef HAVE_BSD_STATFS + /* need to return a string field */ + static struct statfs buf; +#else + struct statfs buf; +#endif + int r; + memset(&buf, 0, sizeof(buf)); + r = statfs(path, &buf); + if (r != 0) + return NULL; +#ifdef HAVE_BSD_STATFS + /* BSD or OSX provides a f_fstypename field */ + return buf.f_fstypename; +#endif + /* Begin of Linux filesystems */ +#ifdef ADFS_SUPER_MAGIC + if (buf.f_type == ADFS_SUPER_MAGIC) + return "adfs"; +#endif +#ifdef AFFS_SUPER_MAGIC + if (buf.f_type == AFFS_SUPER_MAGIC) + return "affs"; +#endif +#ifdef BDEVFS_MAGIC + if (buf.f_type == BDEVFS_MAGIC) + return "bdevfs"; +#endif +#ifdef BEFS_SUPER_MAGIC + if (buf.f_type == BEFS_SUPER_MAGIC) + return "befs"; +#endif +#ifdef BFS_MAGIC + if (buf.f_type == BFS_MAGIC) + return "bfs"; +#endif +#ifdef BINFMTFS_MAGIC + if (buf.f_type == BINFMTFS_MAGIC) + return "binfmtfs"; +#endif +#ifdef BTRFS_SUPER_MAGIC + if (buf.f_type == BTRFS_SUPER_MAGIC) + return "btrfs"; +#endif +#ifdef CGROUP_SUPER_MAGIC + if (buf.f_type == CGROUP_SUPER_MAGIC) + return "cgroup"; +#endif +#ifdef CIFS_MAGIC_NUMBER + if (buf.f_type == CIFS_MAGIC_NUMBER) + return "cifs"; +#endif +#ifdef CODA_SUPER_MAGIC + if (buf.f_type == CODA_SUPER_MAGIC) + return "coda"; +#endif +#ifdef COH_SUPER_MAGIC + if (buf.f_type == COH_SUPER_MAGIC) + return "coh"; +#endif +#ifdef CRAMFS_MAGIC + if (buf.f_type == CRAMFS_MAGIC) + return "cramfs"; +#endif +#ifdef DEBUGFS_MAGIC + if (buf.f_type == DEBUGFS_MAGIC) + return "debugfs"; +#endif +#ifdef DEVFS_SUPER_MAGIC + if (buf.f_type == DEVFS_SUPER_MAGIC) + return "devfs"; 
+#endif +#ifdef DEVPTS_SUPER_MAGIC + if (buf.f_type == DEVPTS_SUPER_MAGIC) + return "devpts"; +#endif +#ifdef EFIVARFS_MAGIC + if (buf.f_type == EFIVARFS_MAGIC) + return "efivarfs"; +#endif +#ifdef EFS_SUPER_MAGIC + if (buf.f_type == EFS_SUPER_MAGIC) + return "efs"; +#endif +#ifdef EXT_SUPER_MAGIC + if (buf.f_type == EXT_SUPER_MAGIC) + return "ext"; +#endif +#ifdef EXT2_OLD_SUPER_MAGIC + if (buf.f_type == EXT2_OLD_SUPER_MAGIC) + return "ext2"; +#endif +#ifdef EXT2_SUPER_MAGIC + if (buf.f_type == EXT2_SUPER_MAGIC) + return "ext2"; +#endif +#ifdef EXT3_SUPER_MAGIC + if (buf.f_type == EXT3_SUPER_MAGIC) + return "ext3"; +#endif +#ifdef EXT4_SUPER_MAGIC + if (buf.f_type == EXT4_SUPER_MAGIC) + return "ext4"; +#endif +#ifdef FUSE_SUPER_MAGIC + if (buf.f_type == FUSE_SUPER_MAGIC) + return "fuse"; +#endif +#ifdef FUTEXFS_SUPER_MAGIC + if (buf.f_type == FUTEXFS_SUPER_MAGIC) + return "futexfs"; +#endif +#ifdef HFS_SUPER_MAGIC + if (buf.f_type == HFS_SUPER_MAGIC) + return "hfs"; +#endif +#ifdef HOSTFS_SUPER_MAGIC + if (buf.f_type == HOSTFS_SUPER_MAGIC) + return "hostfs"; +#endif +#ifdef HPFS_SUPER_MAGIC + if (buf.f_type == HPFS_SUPER_MAGIC) + return "hpfs"; +#endif +#ifdef HUGETLBFS_MAGIC + if (buf.f_type == HUGETLBFS_MAGIC) + return "hugetlbfs"; +#endif +#ifdef ISOFS_SUPER_MAGIC + if (buf.f_type == ISOFS_SUPER_MAGIC) + return "isofs"; +#endif +#ifdef JFFS2_SUPER_MAGIC + if (buf.f_type == JFFS2_SUPER_MAGIC) + return "jffs2"; +#endif +#ifdef JFS_SUPER_MAGIC + if (buf.f_type == JFS_SUPER_MAGIC) + return "jfs"; +#endif +#ifdef MINIX_SUPER_MAGIC + if (buf.f_type == MINIX_SUPER_MAGIC) + return "minix"; +#endif +#ifdef MINIX2_SUPER_MAGIC + if (buf.f_type == MINIX2_SUPER_MAGIC) + return "minix2"; +#endif +#ifdef MINIX3_SUPER_MAGIC + if (buf.f_type == MINIX3_SUPER_MAGIC) + return "minix3"; +#endif +#ifdef MQUEUE_MAGIC + if (buf.f_type == MQUEUE_MAGIC) + return "mqueue"; +#endif +#ifdef MSDOS_SUPER_MAGIC + if (buf.f_type == MSDOS_SUPER_MAGIC) + return "msdos"; +#endif +#ifdef 
NCP_SUPER_MAGIC + if (buf.f_type == NCP_SUPER_MAGIC) + return "ncp"; +#endif +#ifdef NFS_SUPER_MAGIC + if (buf.f_type == NFS_SUPER_MAGIC) + return "nfs"; +#endif +#ifdef NILFS_SUPER_MAGIC + if (buf.f_type == NILFS_SUPER_MAGIC) + return "nilfs"; +#endif +#ifdef NTFS_SB_MAGIC + if (buf.f_type == NTFS_SB_MAGIC) + return "ntfs-sb"; +#endif +#ifdef OCFS2_SUPER_MAGIC + if (buf.f_type == OCFS2_SUPER_MAGIC) + return "ocfs2"; +#endif +#ifdef OPENPROM_SUPER_MAGIC + if (buf.f_type == OPENPROM_SUPER_MAGIC) + return "openprom"; +#endif +#ifdef PIPEFS_MAGIC + if (buf.f_type == PIPEFS_MAGIC) + return "pipefs"; +#endif +#ifdef PROC_SUPER_MAGIC + if (buf.f_type == PROC_SUPER_MAGIC) + return "proc"; +#endif +#ifdef PSTOREFS_MAGIC + if (buf.f_type == PSTOREFS_MAGIC) + return "pstorefs"; +#endif +#ifdef QNX4_SUPER_MAGIC + if (buf.f_type == QNX4_SUPER_MAGIC) + return "qnx4"; +#endif +#ifdef QNX6_SUPER_MAGIC + if (buf.f_type == QNX6_SUPER_MAGIC) + return "qnx6"; +#endif +#ifdef RAMFS_MAGIC + if (buf.f_type == RAMFS_MAGIC) + return "ramfs"; +#endif +#ifdef REISERFS_SUPER_MAGIC + if (buf.f_type == REISERFS_SUPER_MAGIC) + return "reiserfs"; +#endif +#ifdef ROMFS_MAGIC + if (buf.f_type == ROMFS_MAGIC) + return "romfs"; +#endif +#ifdef SELINUX_MAGIC + if (buf.f_type == SELINUX_MAGIC) + return "selinux"; +#endif +#ifdef SMACK_MAGIC + if (buf.f_type == SMACK_MAGIC) + return "smack"; +#endif +#ifdef SMB_SUPER_MAGIC + if (buf.f_type == SMB_SUPER_MAGIC) + return "smb"; +#endif +#ifdef SOCKFS_MAGIC + if (buf.f_type == SOCKFS_MAGIC) + return "sockfs"; +#endif +#ifdef SQUASHFS_MAGIC + if (buf.f_type == SQUASHFS_MAGIC) + return "squashfs"; +#endif +#ifdef SYSFS_MAGIC + if (buf.f_type == SYSFS_MAGIC) + return "sysfs"; +#endif +#ifdef SYSV2_SUPER_MAGIC + if (buf.f_type == SYSV2_SUPER_MAGIC) + return "sysv2"; +#endif +#ifdef SYSV4_SUPER_MAGIC + if (buf.f_type == SYSV4_SUPER_MAGIC) + return "sysv4"; +#endif +#ifdef TMPFS_MAGIC + if (buf.f_type == TMPFS_MAGIC) + return "tmpfs"; +#endif +#ifdef 
UDF_SUPER_MAGIC + if (buf.f_type == UDF_SUPER_MAGIC) + return "udf"; +#endif +#ifdef UFS_MAGIC + if (buf.f_type == UFS_MAGIC) + return "ufs"; +#endif +#ifdef USBDEVICE_SUPER_MAGIC + if (buf.f_type == USBDEVICE_SUPER_MAGIC) + return "usbdevice"; +#endif +#ifdef V9FS_MAGIC + if (buf.f_type == V9FS_MAGIC) + return "v9fs"; +#endif +#ifdef VXFS_SUPER_MAGIC + if (buf.f_type == VXFS_SUPER_MAGIC) + return "vxfs"; +#endif +#ifdef XENFS_SUPER_MAGIC + if (buf.f_type == XENFS_SUPER_MAGIC) + return "xenfs"; +#endif +#ifdef XENIX_SUPER_MAGIC + if (buf.f_type == XENIX_SUPER_MAGIC) + return "xenix"; +#endif +#ifdef XFS_SUPER_MAGIC + if (buf.f_type == XFS_SUPER_MAGIC) + return "xfs"; +#endif + /* End of Linux filesystems */ + return NULL; +} + +static PyObject *pygetfstype(PyObject *self, PyObject *args) +{ + const char *path = NULL; + if (!PyArg_ParseTuple(args, "s", &path)) + return NULL; + + const char *type = getfstype(path); + if (type == NULL) + Py_RETURN_NONE; + + PyObject *result = Py_BuildValue("s", type); + return result; +} +#endif /* def HAVE_STATFS */ + #endif /* ndef _WIN32 */ static PyObject *listdir(PyObject *self, PyObject *args, PyObject *kwargs) @@ -960,6 +1278,10 @@ {"setprocname", (PyCFunction)setprocname, METH_VARARGS, "set process title (best-effort)\n"}, #endif +#ifdef HAVE_STATFS + {"getfstype", (PyCFunction)pygetfstype, METH_VARARGS, + "get filesystem type (best-effort)\n"}, +#endif #endif /* ndef _WIN32 */ #ifdef __APPLE__ { diff -r ed5b25874d99 -r 4baf79a77afa mercurial/parser.py --- a/mercurial/parser.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/parser.py Fri Mar 24 08:37:26 2017 -0700 @@ -19,7 +19,10 @@ from __future__ import absolute_import from .i18n import _ -from . import error +from . 
import ( + error, + util, +) class parser(object): def __init__(self, elements, methods=None): @@ -164,7 +167,7 @@ def unescapestr(s): try: - return s.decode("string_escape") + return util.unescapestr(s) except ValueError as e: # mangle Python's exception into our format raise error.ParseError(str(e).lower()) @@ -265,7 +268,7 @@ """Compose error message from specified ParseError object """ if len(inst.args) > 1: - return _('at %s: %s') % (inst.args[1], inst.args[0]) + return _('at %d: %s') % (inst.args[1], inst.args[0]) else: return inst.args[0] diff -r ed5b25874d99 -r 4baf79a77afa mercurial/parsers.c --- a/mercurial/parsers.c Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/parsers.c Fri Mar 24 08:37:26 2017 -0700 @@ -560,11 +560,11 @@ } /* - * Build a set of non-normal entries from the dirstate dmap + * Build a set of non-normal and other parent entries from the dirstate dmap */ -static PyObject *nonnormalentries(PyObject *self, PyObject *args) -{ - PyObject *dmap, *nonnset = NULL, *fname, *v; +static PyObject *nonnormalotherparententries(PyObject *self, PyObject *args) { + PyObject *dmap, *fname, *v; + PyObject *nonnset = NULL, *otherpset = NULL, *result = NULL; Py_ssize_t pos; if (!PyArg_ParseTuple(args, "O!:nonnormalentries", @@ -575,6 +575,10 @@ if (nonnset == NULL) goto bail; + otherpset = PySet_New(NULL); + if (otherpset == NULL) + goto bail; + pos = 0; while (PyDict_Next(dmap, &pos, &fname, &v)) { dirstateTupleObject *t; @@ -585,15 +589,28 @@ } t = (dirstateTupleObject *)v; + if (t->state == 'n' && t->size == -2) { + if (PySet_Add(otherpset, fname) == -1) { + goto bail; + } + } + if (t->state == 'n' && t->mtime != -1) continue; if (PySet_Add(nonnset, fname) == -1) goto bail; } - return nonnset; + result = Py_BuildValue("(OO)", nonnset, otherpset); + if (result == NULL) + goto bail; + Py_DECREF(nonnset); + Py_DECREF(otherpset); + return result; bail: Py_XDECREF(nonnset); + Py_XDECREF(otherpset); + Py_XDECREF(result); return NULL; } @@ -800,8 +817,8 @@ { if 
(self->inlined && pos > 0) { if (self->offsets == NULL) { - self->offsets = malloc(self->raw_length * - sizeof(*self->offsets)); + self->offsets = PyMem_Malloc(self->raw_length * + sizeof(*self->offsets)); if (self->offsets == NULL) return (const char *)PyErr_NoMemory(); inline_scan(self, self->offsets); @@ -1014,7 +1031,7 @@ self->cache = NULL; } if (self->offsets) { - free(self->offsets); + PyMem_Free(self->offsets); self->offsets = NULL; } if (self->nt) { @@ -2149,7 +2166,7 @@ int *revs; argcount = PySequence_Length(args); - revs = malloc(argcount * sizeof(*revs)); + revs = PyMem_Malloc(argcount * sizeof(*revs)); if (argcount > 0 && revs == NULL) return PyErr_NoMemory(); len = index_length(self) - 1; @@ -2220,11 +2237,11 @@ goto bail; done: - free(revs); + PyMem_Free(revs); return ret; bail: - free(revs); + PyMem_Free(revs); Py_XDECREF(ret); return NULL; } @@ -2722,6 +2739,7 @@ data += nparents * hashwidth; } else { parents = Py_None; + Py_INCREF(parents); } if (data + 2 * nmetadata > dataend) { @@ -2764,8 +2782,7 @@ Py_XDECREF(prec); Py_XDECREF(succs); Py_XDECREF(metadata); - if (parents != Py_None) - Py_XDECREF(parents); + Py_XDECREF(parents); return ret; } @@ -2814,8 +2831,9 @@ static PyMethodDef methods[] = { {"pack_dirstate", pack_dirstate, METH_VARARGS, "pack a dirstate\n"}, - {"nonnormalentries", nonnormalentries, METH_VARARGS, - "create a set containing non-normal entries of given dirstate\n"}, + {"nonnormalotherparententries", nonnormalotherparententries, METH_VARARGS, + "create a set containing non-normal and other parent entries of given " + "dirstate\n"}, {"parse_manifest", parse_manifest, METH_VARARGS, "parse a manifest\n"}, {"parse_dirstate", parse_dirstate, METH_VARARGS, "parse a dirstate\n"}, {"parse_index2", parse_index2, METH_VARARGS, "parse a revlog index\n"}, diff -r ed5b25874d99 -r 4baf79a77afa mercurial/patch.py --- a/mercurial/patch.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/patch.py Fri Mar 24 08:37:26 2017 -0700 @@ -34,9 +34,11 @@ 
mail, mdiff, pathutil, + pycompat, scmutil, similar, util, + vfs as vfsmod, ) stringio = util.stringio @@ -209,7 +211,7 @@ data = {} fd, tmpname = tempfile.mkstemp(prefix='hg-patch-') - tmpfp = os.fdopen(fd, 'w') + tmpfp = os.fdopen(fd, pycompat.sysstr('w')) try: msg = email.Parser.Parser().parse(fileobj) @@ -448,7 +450,7 @@ class fsbackend(abstractbackend): def __init__(self, ui, basedir): super(fsbackend, self).__init__(ui) - self.opener = scmutil.opener(basedir) + self.opener = vfsmod.vfs(basedir) def _join(self, f): return os.path.join(self.opener.base, f) @@ -559,7 +561,7 @@ else: if self.opener is None: root = tempfile.mkdtemp(prefix='hg-patch-') - self.opener = scmutil.opener(root) + self.opener = vfsmod.vfs(root) # Avoid filename issues with these simple names fn = str(self.created) self.opener.write(fn, data) @@ -1055,7 +1057,7 @@ ncpatchfp = None try: # Write the initial patch - f = os.fdopen(patchfd, "w") + f = os.fdopen(patchfd, pycompat.sysstr("w")) chunk.header.write(f) chunk.write(f) f.write('\n'.join(['# ' + i for i in phelp.splitlines()])) @@ -1063,7 +1065,8 @@ # Start the editor and wait for it to complete editor = ui.geteditor() ret = ui.system("%s \"%s\"" % (editor, patchfn), - environ={'HGUSER': ui.username()}) + environ={'HGUSER': ui.username()}, + blockedtag='filterpatch') if ret != 0: ui.warn(_("editor exited with exit code %d\n") % ret) continue @@ -2212,8 +2215,8 @@ return mdiff.diffopts(**buildopts) -def diff(repo, node1=None, node2=None, match=None, changes=None, opts=None, - losedatafn=None, prefix='', relroot='', copy=None): +def diff(repo, node1=None, node2=None, match=None, changes=None, + opts=None, losedatafn=None, prefix='', relroot='', copy=None): '''yields diff of changes to files between two nodes, or node and working directory. 
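The os.fdopen() hunks in this file wrap the mode in pycompat.sysstr() because Python 3 expects the mode as a native str rather than bytes. A minimal sketch of that pattern follows; the sysstr() here is a simplified stand-in written for this example (it assumes a latin-1 decode for bytes input), and write_temp() is a hypothetical helper, not a function from patch.py.

```python
import os
import tempfile


def sysstr(s):
    """Simplified stand-in for pycompat.sysstr: return a native str."""
    if isinstance(s, str):
        return s
    # Assumption: bytes are decoded as latin-1, which cannot fail.
    return s.decode('latin-1')


def write_temp(data, mode=b'w'):
    """Write text to a fresh temporary file and return its path."""
    fd, name = tempfile.mkstemp(prefix='hg-patch-')
    # os.fdopen() needs a native str mode, hence the sysstr() call.
    fp = os.fdopen(fd, sysstr(mode))
    try:
        fp.write(data)
    finally:
        fp.close()
    return name
```

The caller is responsible for removing the returned file, for example with os.unlink(name).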
@@ -2236,6 +2239,24 @@ copy, if not empty, should contain mappings {dst@y: src@x} of copy information.''' + for header, hunks in diffhunks(repo, node1=node1, node2=node2, match=match, + changes=changes, opts=opts, + losedatafn=losedatafn, prefix=prefix, + relroot=relroot, copy=copy): + text = ''.join(sum((list(hlines) for hrange, hlines in hunks), [])) + if header and (text or len(header) > 1): + yield '\n'.join(header) + '\n' + if text: + yield text + +def diffhunks(repo, node1=None, node2=None, match=None, changes=None, + opts=None, losedatafn=None, prefix='', relroot='', copy=None): + """Yield diff of changes to files in the form of (`header`, `hunks`) tuples + where `header` is a list of diff headers and `hunks` is an iterable of + (`hunkrange`, `hunklines`) tuples. + + See diff() for the meaning of parameters. + """ if opts is None: opts = mdiff.defaultopts @@ -2536,6 +2557,7 @@ if text: header.append('index %s..%s' % (gitindex(content1), gitindex(content2))) + hunks = (None, [text]), else: if opts.git and opts.index > 0: flag = flag1 @@ -2546,13 +2568,11 @@ gitindex(content2)[0:opts.index], gitmode[flag])) - text = mdiff.unidiff(content1, date1, - content2, date2, - path1, path2, opts=opts) - if header and (text or len(header) > 1): - yield '\n'.join(header) + '\n' - if text: - yield text + uheaders, hunks = mdiff.unidiff(content1, date1, + content2, date2, + path1, path2, opts=opts) + header.extend(uheaders) + yield header, hunks def diffstatsum(stats): maxfile, maxtotal, addtotal, removetotal, binary = 0, 0, 0, 0, False diff -r ed5b25874d99 -r 4baf79a77afa mercurial/phases.py --- a/mercurial/phases.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/phases.py Fri Mar 24 08:37:26 2017 -0700 @@ -113,8 +113,9 @@ short, ) from . 
import ( - encoding, error, + smartset, + txnutil, ) allphases = public, draft, secret = range(3) @@ -136,15 +137,7 @@ dirty = False roots = [set() for i in allphases] try: - f = None - if 'HG_PENDING' in encoding.environ: - try: - f = repo.svfs('phaseroots.pending') - except IOError as inst: - if inst.errno != errno.ENOENT: - raise - if f is None: - f = repo.svfs('phaseroots') + f, pending = txnutil.trypending(repo.root, repo.svfs, 'phaseroots') try: for line in f: phase, nh = line.split() @@ -170,6 +163,27 @@ self.filterunknown(repo) self.opener = repo.svfs + def getrevset(self, repo, phases): + """return a smartset for the given phases""" + self.loadphaserevs(repo) # ensure phase's sets are loaded + + if self._phasesets and all(self._phasesets[p] is not None + for p in phases): + # fast path - use _phasesets + revs = self._phasesets[phases[0]] + if len(phases) > 1: + revs = revs.copy() # only copy when needed + for p in phases[1:]: + revs.update(self._phasesets[p]) + if repo.changelog.filteredrevs: + revs = revs - repo.changelog.filteredrevs + return smartset.baseset(revs) + else: + # slow path - enumerate all revisions + phase = self.phase + revs = (r for r in repo if phase(repo, r) in phases) + return smartset.generatorset(revs, iterasc=True) + def copy(self): # Shallow copy meant to ensure isolation in # advance/retractboundary(), nothing more. 
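The fast path and slow path of the new getrevset() method can be illustrated with plain Python sets. This is a sketch under stated assumptions: getrevs() and its parameters are hypothetical names, the result is a sorted list rather than a smartset, and the phase lookup is passed in as a callable instead of being read from a phasecache.

```python
def getrevs(phasesets, wanted, filtered=frozenset(), allrevs=(), phaseof=None):
    """Return the sorted revisions whose phase is in ``wanted``.

    phasesets: list indexed by phase holding a set of revs or None
    wanted:    tuple of phase numbers to select
    filtered:  revisions hidden from the current view
    allrevs / phaseof: used only when the per-phase sets are unavailable
    """
    if phasesets and all(phasesets[p] is not None for p in wanted):
        # Fast path: union the precomputed per-phase sets.
        revs = phasesets[wanted[0]]
        if len(wanted) > 1:
            revs = revs.copy()  # copy only when we are about to mutate
            for p in wanted[1:]:
                revs.update(phasesets[p])
        if filtered:
            revs = revs - filtered
        return sorted(revs)
    # Slow path: test every revision individually.
    return [r for r in allrevs if phaseof(r) in wanted]
```

The copy before update() matters: without it, the union would be written back into the cached set for the first requested phase.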
@@ -199,7 +213,7 @@ self._phaserevs = revs self._populatephaseroots(repo) for phase in trackedphases: - roots = map(repo.changelog.rev, self.phaseroots[phase]) + roots = list(map(repo.changelog.rev, self.phaseroots[phase])) if roots: for rev in roots: revs[rev] = phase @@ -210,12 +224,8 @@ """ensure phase information is loaded in the object""" if self._phaserevs is None: try: - if repo.ui.configbool('experimental', - 'nativephaseskillswitch'): - self._computephaserevspure(repo) - else: - res = self._getphaserevsnative(repo) - self._phaserevs, self._phasesets = res + res = self._getphaserevsnative(repo) + self._phaserevs, self._phasesets = res except AttributeError: self._computephaserevspure(repo) diff -r ed5b25874d99 -r 4baf79a77afa mercurial/policy.py --- a/mercurial/policy.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/policy.py Fri Mar 24 08:37:26 2017 -0700 @@ -19,9 +19,9 @@ # py - only load pure Python modules # # By default, require the C extensions for performance reasons. -policy = 'c' -policynoc = ('cffi', 'cffi-allow', 'py') -policynocffi = ('c', 'py') +policy = b'c' +policynoc = (b'cffi', b'cffi-allow', b'py') +policynocffi = (b'c', b'py') try: from . import __modulepolicy__ @@ -39,7 +39,11 @@ # Our C extensions aren't yet compatible with Python 3. So use pure Python # on Python 3 for now. if sys.version_info[0] >= 3: - policy = 'py' + policy = b'py' # Environment variable can always force settings. 
-policy = os.environ.get('HGMODULEPOLICY', policy) +if sys.version_info[0] >= 3: + if 'HGMODULEPOLICY' in os.environ: + policy = os.environ['HGMODULEPOLICY'].encode('utf-8') +else: + policy = os.environ.get('HGMODULEPOLICY', policy) diff -r ed5b25874d99 -r 4baf79a77afa mercurial/posix.py --- a/mercurial/posix.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/posix.py Fri Mar 24 08:37:26 2017 -0700 @@ -105,7 +105,7 @@ fp = open(f) data = fp.read() fp.close() - os.unlink(f) + unlink(f) try: os.symlink(data, f) except OSError: @@ -118,7 +118,7 @@ if stat.S_ISLNK(s): # switch link to file data = os.readlink(f) - os.unlink(f) + unlink(f) fp = open(f, "w") fp.write(data) fp.close() @@ -181,15 +181,15 @@ except OSError as e: if e.errno != errno.ENOENT: raise - file(checknoexec, 'w').close() # might fail + open(checknoexec, 'w').close() # might fail m = os.stat(checknoexec).st_mode if m & EXECFLAGS == 0: # check-exec is exec and check-no-exec is not exec return True # checknoexec exists but is exec - delete it - os.unlink(checknoexec) + unlink(checknoexec) # checkisexec exists but is not exec - delete it - os.unlink(checkisexec) + unlink(checkisexec) # check using one file, leave it as checkisexec checkdir = cachedir @@ -210,7 +210,7 @@ return True finally: if fn is not None: - os.unlink(fn) + unlink(fn) except (IOError, OSError): # we don't care, the user probably won't be able to commit anyway return False @@ -230,13 +230,16 @@ else: checkdir = path cachedir = None - name = tempfile.mktemp(dir=checkdir, prefix='checklink-') + fscheckdir = pycompat.fsdecode(checkdir) + name = tempfile.mktemp(dir=fscheckdir, + prefix=r'checklink-') + name = pycompat.fsencode(name) try: fd = None if cachedir is None: - fd = tempfile.NamedTemporaryFile(dir=checkdir, - prefix='hg-checklink-') - target = os.path.basename(fd.name) + fd = tempfile.NamedTemporaryFile(dir=fscheckdir, + prefix=r'hg-checklink-') + target = pycompat.fsencode(os.path.basename(fd.name)) else: # create a fixed file to 
link to; doesn't matter if it # already exists. @@ -245,12 +248,12 @@ try: os.symlink(target, name) if cachedir is None: - os.unlink(name) + unlink(name) else: try: os.rename(name, checklink) except OSError: - os.unlink(name) + unlink(name) return True except OSError as inst: # link creation might race, try again @@ -265,7 +268,7 @@ except OSError as inst: # sshfs might report failure while successfully creating the link if inst[0] == errno.EIO and os.path.exists(name): - os.unlink(name) + unlink(name) return False def checkosfilename(path): @@ -408,7 +411,7 @@ return '"%s"' % s global _needsshellquote if _needsshellquote is None: - _needsshellquote = re.compile(r'[^a-zA-Z0-9._/+-]').search + _needsshellquote = re.compile(br'[^a-zA-Z0-9._/+-]').search if s and not _needsshellquote(s): # "s" shouldn't have to be quoted return s @@ -533,19 +536,6 @@ def makedir(path, notindexed): os.mkdir(path) -def unlinkpath(f, ignoremissing=False): - """unlink and remove the directory if it is empty""" - try: - os.unlink(f) - except OSError as e: - if not (ignoremissing and e.errno == errno.ENOENT): - raise - # try removing directories that might now be empty - try: - os.removedirs(os.path.dirname(f)) - except OSError: - pass - def lookupreg(key, name=None, scope=None): return None diff -r ed5b25874d99 -r 4baf79a77afa mercurial/profiling.py --- a/mercurial/profiling.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/profiling.py Fri Mar 24 08:37:26 2017 -0700 @@ -8,7 +8,6 @@ from __future__ import absolute_import, print_function import contextlib -import time from .i18n import _ from . import ( @@ -66,7 +65,7 @@ collapse_recursion = True thread = flamegraph.ProfileThread(fp, 1.0 / freq, filter_, collapse_recursion) - start_time = time.clock() + start_time = util.timer() try: thread.start() yield @@ -74,7 +73,7 @@ thread.stop() thread.join() print('Collected %d stack frames (%d unique) in %2.2f seconds.' 
% ( - time.clock() - start_time, thread.num_frames(), + util.timer() - start_time, thread.num_frames(), thread.num_frames(unique=True))) @contextlib.contextmanager @@ -103,6 +102,7 @@ 'bymethod': statprof.DisplayFormats.ByMethod, 'hotpath': statprof.DisplayFormats.Hotpath, 'json': statprof.DisplayFormats.Json, + 'chrome': statprof.DisplayFormats.Chrome, } if profformat in formats: @@ -111,7 +111,23 @@ ui.warn(_('unknown profiler output format: %s\n') % profformat) displayformat = statprof.DisplayFormats.Hotpath - statprof.display(fp, data=data, format=displayformat) + kwargs = {} + + def fraction(s): + if s.endswith('%'): + v = float(s[:-1]) / 100 + else: + v = float(s) + if 0 <= v <= 1: + return v + raise ValueError(s) + + if profformat == 'chrome': + showmin = ui.configwith(fraction, 'profiling', 'showmin', 0.005) + showmax = ui.configwith(fraction, 'profiling', 'showmax', 0.999) + kwargs.update(minthreshold=showmin, maxthreshold=showmax) + + statprof.display(fp, data=data, format=displayformat, **kwargs) @contextlib.contextmanager def profile(ui): diff -r ed5b25874d99 -r 4baf79a77afa mercurial/pure/osutil.py --- a/mercurial/pure/osutil.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/pure/osutil.py Fri Mar 24 08:37:26 2017 -0700 @@ -338,7 +338,7 @@ _kernel32.CloseHandle(fh) _raiseioerror(name) - f = os.fdopen(fd, mode, bufsize) + f = os.fdopen(fd, pycompat.sysstr(mode), bufsize) # unfortunately, f.name is '' at this point -- so we store # the name on this wrapper. We cannot just assign to f.name, # because that attribute is read-only. diff -r ed5b25874d99 -r 4baf79a77afa mercurial/pure/parsers.py --- a/mercurial/pure/parsers.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/pure/parsers.py Fri Mar 24 08:37:26 2017 -0700 @@ -14,6 +14,7 @@ from . 
import pycompat stringio = pycompat.stringio + _pack = struct.pack _unpack = struct.unpack _compress = zlib.compress @@ -34,7 +35,7 @@ return int(q & 0xFFFF) def offset_type(offset, type): - return long(long(offset) << 16 | type) + return int(int(offset) << 16 | type) class BaseIndexObject(object): def __len__(self): diff -r ed5b25874d99 -r 4baf79a77afa mercurial/pycompat.py --- a/mercurial/pycompat.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/pycompat.py Fri Mar 24 08:37:26 2017 -0700 @@ -19,26 +19,23 @@ if not ispy3: import cPickle as pickle - import cStringIO as io import httplib import Queue as _queue import SocketServer as socketserver - import urlparse - urlunquote = urlparse.unquote import xmlrpclib else: import http.client as httplib - import io import pickle import queue as _queue import socketserver - import urllib.parse as urlparse - urlunquote = urlparse.unquote_to_bytes import xmlrpc.client as xmlrpclib if ispy3: import builtins import functools + import io + import struct + fsencode = os.fsencode fsdecode = os.fsdecode # A bytes version of os.name. @@ -55,6 +52,8 @@ sysexecutable = sys.executable if sysexecutable: sysexecutable = os.fsencode(sysexecutable) + stringio = io.BytesIO + maplist = lambda *args: list(map(*args)) # TODO: .buffer might not exist if std streams were replaced; we'll need # a silly wrapper to make a bytes stream backed by a unicode one. @@ -72,6 +71,73 @@ if getattr(sys, 'argv', None) is not None: sysargv = list(map(os.fsencode, sys.argv)) + bytechr = struct.Struct('>B').pack + + class bytestr(bytes): + """A bytes which mostly acts as a Python 2 str + + >>> bytestr(), bytestr(bytearray(b'foo')), bytestr(u'ascii'), bytestr(1) + (b'', b'foo', b'ascii', b'1') + >>> s = bytestr(b'foo') + >>> assert s is bytestr(s) + + There's no implicit conversion from non-ascii str as its encoding is + unknown: + + >>> bytestr(chr(0x80)) # doctest: +ELLIPSIS + Traceback (most recent call last): + ... + UnicodeEncodeError: ... 
+
+        Comparison between bytestr and bytes should work:
+
+        >>> assert bytestr(b'foo') == b'foo'
+        >>> assert b'foo' == bytestr(b'foo')
+        >>> assert b'f' in bytestr(b'foo')
+        >>> assert bytestr(b'f') in b'foo'
+
+        Sliced elements should be bytes, not integer:
+
+        >>> s[1], s[:2]
+        (b'o', b'fo')
+        >>> list(s), list(reversed(s))
+        ([b'f', b'o', b'o'], [b'o', b'o', b'f'])
+
+        As bytestr type isn't propagated across operations, you need to cast
+        bytes to bytestr explicitly:
+
+        >>> s = bytestr(b'foo').upper()
+        >>> t = bytestr(s)
+        >>> s[0], t[0]
+        (70, b'F')
+
+        Be careful to not pass a bytestr object to a function which expects
+        bytearray-like behavior.
+
+        >>> t = bytes(t) # cast to bytes
+        >>> assert type(t) is bytes
+        """
+
+        def __new__(cls, s=b''):
+            if isinstance(s, bytestr):
+                return s
+            if not isinstance(s, (bytes, bytearray)):
+                s = str(s).encode(u'ascii')
+            return bytes.__new__(cls, s)
+
+        def __getitem__(self, key):
+            s = bytes.__getitem__(self, key)
+            if not isinstance(s, bytes):
+                s = bytechr(s)
+            return s
+
+        def __iter__(self):
+            return iterbytestr(bytes.__iter__(self))
+
+    def iterbytestr(s):
+        """Iterate bytes as if it were a str object of Python 2"""
+        return map(bytechr, s)
+
     def sysstr(s):
         """Return a keyword str to be passed to Python functions such as
         getattr() and str.encode()
@@ -97,6 +163,9 @@
     setattr = _wrapattrfunc(builtins.setattr)
     xrange = builtins.range

+    def open(name, mode='r', buffering=-1):
+        return builtins.open(name, sysstr(mode), buffering)
+
     # getopt.getopt() on Python 3 deals with unicodes internally so we cannot
     # pass bytes there. Passing unicodes will result in unicodes as return
     # values which we need to convert again to bytes.
@@ -132,6 +201,12 @@
         return [a.encode('latin-1') for a in ret]

 else:
+    import cStringIO
+
+    bytechr = chr
+    bytestr = str
+    iterbytestr = iter
+
     def sysstr(s):
         return s

@@ -172,8 +247,9 @@
     getcwd = os.getcwd
     sysexecutable = sys.executable
     shlexsplit = shlex.split
+    stringio = cStringIO.StringIO
+    maplist = map

-stringio = io.StringIO
 empty = _queue.Empty
 queue = _queue.Queue

@@ -188,6 +264,10 @@
             (item.replace(sysstr('_'), sysstr('')).lower(), (origin, item))
             for item in items)

+    def _registeralias(self, origin, attr, name):
+        """Alias ``origin``.``attr`` as ``name``"""
+        self._aliases[sysstr(name)] = (origin, sysstr(attr))
+
     def __getattr__(self, name):
         try:
             origin, item = self._aliases[name]
@@ -205,6 +285,7 @@
     import SimpleHTTPServer
     import urllib2
     import urllib
+    import urlparse
     urlreq._registeraliases(urllib, (
         "addclosehook",
         "addinfourl",
@@ -235,6 +316,10 @@
         "Request",
         "urlopen",
     ))
+    urlreq._registeraliases(urlparse, (
+        "urlparse",
+        "urlunparse",
+    ))
     urlerr._registeraliases(urllib2, (
         "HTTPError",
         "URLError",
@@ -251,11 +336,19 @@
     ))

 else:
+    import urllib.parse
+    urlreq._registeraliases(urllib.parse, (
+        "splitattr",
+        "splitpasswd",
+        "splitport",
+        "splituser",
+        "urlparse",
+        "urlunparse",
+    ))
+    urlreq._registeralias(urllib.parse, "unquote_to_bytes", "unquote")
     import urllib.request
     urlreq._registeraliases(urllib.request, (
         "AbstractHTTPHandler",
-        "addclosehook",
-        "addinfourl",
         "BaseHandler",
         "build_opener",
         "FileHandler",
@@ -269,16 +362,15 @@
         "HTTPDigestAuthHandler",
         "HTTPPasswordMgrWithDefaultRealm",
         "ProxyHandler",
-        "quote",
         "Request",
-        "splitattr",
-        "splitpasswd",
-        "splitport",
-        "splituser",
-        "unquote",
         "url2pathname",
         "urlopen",
     ))
+    import urllib.response
+    urlreq._registeraliases(urllib.response, (
+        "addclosehook",
+        "addinfourl",
+    ))
     import urllib.error
     urlerr._registeraliases(urllib.error, (
         "HTTPError",
@@ -291,3 +383,12 @@
         "SimpleHTTPRequestHandler",
         "CGIHTTPRequestHandler",
     ))
+
+    # urllib.parse.quote() accepts both str and bytes, decodes bytes
+    # (if necessary), and returns str. This is wonky. We provide a custom
+    # implementation that only accepts bytes and emits bytes.
+    def quote(s, safe=r'/'):
+        s = urllib.parse.quote_from_bytes(s, safe=safe)
+        return s.encode('ascii', 'strict')
+
+    urlreq.quote = quote
diff -r ed5b25874d99 -r 4baf79a77afa mercurial/repair.py
--- a/mercurial/repair.py Thu Mar 23 19:54:59 2017 -0700
+++ b/mercurial/repair.py Fri Mar 24 08:37:26 2017 -0700
@@ -12,7 +12,6 @@
 import hashlib
 import stat
 import tempfile
-import time

 from .i18n import _
 from .node import short
@@ -27,6 +26,7 @@
     revlog,
     scmutil,
     util,
+    vfs as vfsmod,
 )

 def _bundle(repo, bases, heads, node, suffix, compress=True):
@@ -883,10 +883,10 @@
     ui.write(_('data fully migrated to temporary repository\n'))

     backuppath = tempfile.mkdtemp(prefix='upgradebackup.', dir=srcrepo.path)
-    backupvfs = scmutil.vfs(backuppath)
+    backupvfs = vfsmod.vfs(backuppath)

     # Make a backup of requires file first, as it is the first to be modified.
-    util.copyfile(srcrepo.join('requires'), backupvfs.join('requires'))
+    util.copyfile(srcrepo.vfs.join('requires'), backupvfs.join('requires'))

     # We install an arbitrary requirement that clients must not support
     # as a mechanism to lock out new clients during the data swap. This is
@@ -905,10 +905,10 @@
     # the operation nearly instantaneous and atomic (at least in well-behaved
     # environments).
     ui.write(_('replacing store...\n'))
-    tstart = time.time()
+    tstart = util.timer()
     util.rename(srcrepo.spath, backupvfs.join('store'))
     util.rename(dstrepo.spath, srcrepo.spath)
-    elapsed = time.time() - tstart
+    elapsed = util.timer() - tstart
     ui.write(_('store replacement complete; repository was inconsistent for '
                '%0.1fs\n') % elapsed)

diff -r ed5b25874d99 -r 4baf79a77afa mercurial/repoview.py
--- a/mercurial/repoview.py Thu Mar 23 19:54:59 2017 -0700
+++ b/mercurial/repoview.py Fri Mar 24 08:37:26 2017 -0700
@@ -104,7 +104,7 @@
     """
     h = hashlib.sha1()
     h.update(''.join(repo.heads()))
-    h.update(str(hash(frozenset(hideable))))
+    h.update('%d' % hash(frozenset(hideable)))
     return h.digest()

 def _writehiddencache(cachefile, cachehash, hidden):
@@ -139,15 +139,13 @@
         if wlock:
             wlock.release()

-def tryreadcache(repo, hideable):
-    """read a cache if the cache exists and is valid, otherwise returns None."""
+def _readhiddencache(repo, cachefilename, newhash):
     hidden = fh = None
     try:
         if repo.vfs.exists(cachefile):
             fh = repo.vfs.open(cachefile, 'rb')
             version, = struct.unpack(">H", fh.read(2))
             oldhash = fh.read(20)
-            newhash = cachehash(repo, hideable)
             if (cacheversion, oldhash) == (version, newhash):
                 # cache is valid, so we can start reading the hidden revs
                 data = fh.read()
@@ -165,6 +163,11 @@
         if fh:
             fh.close()

+def tryreadcache(repo, hideable):
+    """read a cache if the cache exists and is valid, otherwise returns None."""
+    newhash = cachehash(repo, hideable)
+    return _readhiddencache(repo, cachefile, newhash)
+
 def computehidden(repo):
     """compute the set of hidden revision to filter

@@ -297,10 +300,10 @@
     """

     def __init__(self, repo, filtername):
-        object.__setattr__(self, '_unfilteredrepo', repo)
-        object.__setattr__(self, 'filtername', filtername)
-        object.__setattr__(self, '_clcachekey', None)
-        object.__setattr__(self, '_clcache', None)
+        object.__setattr__(self, r'_unfilteredrepo', repo)
+        object.__setattr__(self, r'filtername', filtername)
+        object.__setattr__(self, r'_clcachekey', None)
+        object.__setattr__(self, r'_clcache', None)

     # not a propertycache on purpose we shall implement a proper cache later
     @property
@@ -328,8 +331,8 @@
         if cl is None:
             cl = copy.copy(unfichangelog)
             cl.filteredrevs = revs
-            object.__setattr__(self, '_clcache', cl)
-            object.__setattr__(self, '_clcachekey', newkey)
+            object.__setattr__(self, r'_clcache', cl)
+            object.__setattr__(self, r'_clcachekey', newkey)
         return cl

     def unfiltered(self):
diff -r ed5b25874d99 -r 4baf79a77afa mercurial/revlog.py
--- a/mercurial/revlog.py Thu Mar 23 19:54:59 2017 -0700
+++ b/mercurial/revlog.py Fri Mar 24 08:37:26 2017 -0700
@@ -33,6 +33,7 @@
     error,
     mdiff,
     parsers,
+    pycompat,
     templatefilters,
     util,
 )
@@ -117,7 +118,7 @@
 def offset_type(offset, type):
     if (type & ~REVIDX_KNOWN_FLAGS) != 0:
         raise ValueError('unknown revlog index flags')
-    return long(long(offset) << 16 | type)
+    return int(int(offset) << 16 | type)

 _nullhash = hashlib.sha1(nullid)

@@ -943,7 +944,7 @@
             ancs = self.index.commonancestorsheads(a, b)
         except (AttributeError, OverflowError): # C implementation failed
             ancs = ancestor.commonancestorsheads(self.parentrevs, a, b)
-        return map(self.node, ancs)
+        return pycompat.maplist(self.node, ancs)

     def isancestor(self, a, b):
         """return True if node a is an ancestor of node b
@@ -1234,7 +1235,7 @@
     def revdiff(self, rev1, rev2):
         """return or calculate a delta between two revisions"""
         if rev1 != nullrev and self.deltaparent(rev2) == rev1:
-            return str(self._chunk(rev2))
+            return bytes(self._chunk(rev2))

         return mdiff.textdiff(self.revision(rev1),
                               self.revision(rev2))
@@ -1277,7 +1278,7 @@

         bins = self._chunks(chain, df=_df)
         if text is None:
-            text = str(bins[0])
+            text = bytes(bins[0])
             bins = bins[1:]

         text = mdiff.patches(text, bins)
@@ -1521,7 +1522,7 @@
         #
         # According to `hg perfrevlogchunks`, this is ~0.5% faster for zlib
         # compressed chunks. And this matters for changelog and manifest reads.
-        t = data[0]
+        t = data[0:1]

         if t == 'x':
             try:
diff -r ed5b25874d99 -r 4baf79a77afa mercurial/revset.py
--- a/mercurial/revset.py Thu Mar 23 19:54:59 2017 -0700
+++ b/mercurial/revset.py Fri Mar 24 08:37:26 2017 -0700
@@ -9,7 +9,6 @@

 import heapq
 import re
-import string

 from .i18n import _
 from . import (
@@ -20,15 +19,34 @@
     match as matchmod,
     node,
     obsolete as obsmod,
-    parser,
     pathutil,
     phases,
-    pycompat,
     registrar,
     repoview,
+    revsetlang,
+    smartset,
     util,
 )

+# helpers for processing parsed tree
+getsymbol = revsetlang.getsymbol
+getstring = revsetlang.getstring
+getinteger = revsetlang.getinteger
+getlist = revsetlang.getlist
+getrange = revsetlang.getrange
+getargs = revsetlang.getargs
+getargsdict = revsetlang.getargsdict
+
+# constants used as an argument of match() and matchany()
+anyorder = revsetlang.anyorder
+defineorder = revsetlang.defineorder
+followorder = revsetlang.followorder
+
+baseset = smartset.baseset
+generatorset = smartset.generatorset
+spanset = smartset.spanset
+fullreposet = smartset.fullreposet
+
 def _revancestors(repo, revs, followfirst):
     """Like revlog.ancestors(), but supports followfirst."""
     if followfirst:
@@ -146,213 +164,8 @@
     revs.sort()
     return revs

-elements = {
-    # token-type: binding-strength, primary, prefix, infix, suffix
-    "(": (21, None, ("group", 1, ")"), ("func", 1, ")"), None),
-    "##": (20, None, None, ("_concat", 20), None),
-    "~": (18, None, None, ("ancestor", 18), None),
-    "^": (18, None, None, ("parent", 18), "parentpost"),
-    "-": (5, None, ("negate", 19), ("minus", 5), None),
-    "::": (17, None, ("dagrangepre", 17), ("dagrange", 17), "dagrangepost"),
-    "..": (17, None, ("dagrangepre", 17), ("dagrange", 17), "dagrangepost"),
-    ":": (15, "rangeall", ("rangepre", 15), ("range", 15), "rangepost"),
-    "not": (10, None, ("not", 10), None, None),
-    "!": (10, None, ("not", 10), None, None),
-    "and": (5, None, None, ("and", 5), None),
-    "&": (5, None, None, ("and", 5), None),
-    "%": (5, None, None, ("only", 5), "onlypost"),
-    "or": (4, None, None, ("or", 4), None),
-    "|": (4, None, None, ("or", 4), None),
-    "+": (4, None, None, ("or", 4), None),
-    "=": (3, None, None, ("keyvalue", 3), None),
-    ",": (2, None, None, ("list", 2), None),
-    ")": (0, None, None, None, None),
-    "symbol": (0, "symbol", None, None, None),
-    "string": (0, "string", None, None, None),
-    "end": (0, None, None, None, None),
-}
-
-keywords = set(['and', 'or', 'not'])
-
-# default set of valid characters for the initial letter of symbols
-_syminitletters = set(
-    string.ascii_letters +
-    string.digits + pycompat.sysstr('._@')) | set(map(chr, xrange(128, 256)))
-
-# default set of valid characters for non-initial letters of symbols
-_symletters = _syminitletters | set(pycompat.sysstr('-/'))
-
-def tokenize(program, lookup=None, syminitletters=None, symletters=None):
-    '''
-    Parse a revset statement into a stream of tokens
-
-    ``syminitletters`` is the set of valid characters for the initial
-    letter of symbols.
-
-    By default, character ``c`` is recognized as valid for initial
-    letter of symbols, if ``c.isalnum() or c in '._@' or ord(c) > 127``.
-
-    ``symletters`` is the set of valid characters for non-initial
-    letters of symbols.
-
-    By default, character ``c`` is recognized as valid for non-initial
-    letters of symbols, if ``c.isalnum() or c in '-._/@' or ord(c) > 127``.
- - Check that @ is a valid unquoted token character (issue3686): - >>> list(tokenize("@::")) - [('symbol', '@', 0), ('::', None, 1), ('end', None, 3)] - - ''' - if syminitletters is None: - syminitletters = _syminitletters - if symletters is None: - symletters = _symletters - - if program and lookup: - # attempt to parse old-style ranges first to deal with - # things like old-tag which contain query metacharacters - parts = program.split(':', 1) - if all(lookup(sym) for sym in parts if sym): - if parts[0]: - yield ('symbol', parts[0], 0) - if len(parts) > 1: - s = len(parts[0]) - yield (':', None, s) - if parts[1]: - yield ('symbol', parts[1], s + 1) - yield ('end', None, len(program)) - return - - pos, l = 0, len(program) - while pos < l: - c = program[pos] - if c.isspace(): # skip inter-token whitespace - pass - elif c == ':' and program[pos:pos + 2] == '::': # look ahead carefully - yield ('::', None, pos) - pos += 1 # skip ahead - elif c == '.' and program[pos:pos + 2] == '..': # look ahead carefully - yield ('..', None, pos) - pos += 1 # skip ahead - elif c == '#' and program[pos:pos + 2] == '##': # look ahead carefully - yield ('##', None, pos) - pos += 1 # skip ahead - elif c in "():=,-|&+!~^%": # handle simple operators - yield (c, None, pos) - elif (c in '"\'' or c == 'r' and - program[pos:pos + 2] in ("r'", 'r"')): # handle quoted strings - if c == 'r': - pos += 1 - c = program[pos] - decode = lambda x: x - else: - decode = parser.unescapestr - pos += 1 - s = pos - while pos < l: # find closing quote - d = program[pos] - if d == '\\': # skip over escaped characters - pos += 2 - continue - if d == c: - yield ('string', decode(program[s:pos]), s) - break - pos += 1 - else: - raise error.ParseError(_("unterminated string"), s) - # gather up a symbol/keyword - elif c in syminitletters: - s = pos - pos += 1 - while pos < l: # find end of symbol - d = program[pos] - if d not in symletters: - break - if d == '.' 
and program[pos - 1] == '.': # special case for .. - pos -= 1 - break - pos += 1 - sym = program[s:pos] - if sym in keywords: # operator keywords - yield (sym, None, s) - elif '-' in sym: - # some jerk gave us foo-bar-baz, try to check if it's a symbol - if lookup and lookup(sym): - # looks like a real symbol - yield ('symbol', sym, s) - else: - # looks like an expression - parts = sym.split('-') - for p in parts[:-1]: - if p: # possible consecutive - - yield ('symbol', p, s) - s += len(p) - yield ('-', None, pos) - s += 1 - if parts[-1]: # possible trailing - - yield ('symbol', parts[-1], s) - else: - yield ('symbol', sym, s) - pos -= 1 - else: - raise error.ParseError(_("syntax error in revset '%s'") % - program, pos) - pos += 1 - yield ('end', None, pos) - # helpers -_notset = object() - -def getsymbol(x): - if x and x[0] == 'symbol': - return x[1] - raise error.ParseError(_('not a symbol')) - -def getstring(x, err): - if x and (x[0] == 'string' or x[0] == 'symbol'): - return x[1] - raise error.ParseError(err) - -def getinteger(x, err, default=_notset): - if not x and default is not _notset: - return default - try: - return int(getstring(x, err)) - except ValueError: - raise error.ParseError(err) - -def getlist(x): - if not x: - return [] - if x[0] == 'list': - return list(x[1:]) - return [x] - -def getrange(x, err): - if not x: - raise error.ParseError(err) - op = x[0] - if op == 'range': - return x[1], x[2] - elif op == 'rangepre': - return None, x[1] - elif op == 'rangepost': - return x[1], None - elif op == 'rangeall': - return None, None - raise error.ParseError(err) - -def getargs(x, min, max, err): - l = getlist(x) - if len(l) < min or (max >= 0 and len(l) > max): - raise error.ParseError(err) - return l - -def getargsdict(x, funcname, keys): - return parser.buildargsdict(getlist(x), funcname, parser.splitargspec(keys), - keyvaluenode='keyvalue', keynode='symbol') - def getset(repo, subset, x): if not x: raise error.ParseError(_("missing argument")) @@ 
-501,7 +314,7 @@ @predicate('_destupdate') def _destupdate(repo, subset, x): # experimental revset for update destination - args = getargsdict(x, 'limit', 'clean check') + args = getargsdict(x, 'limit', 'clean') return subset & baseset([destutil.destupdate(repo, **args)[0]]) @predicate('_destmerge') @@ -1139,7 +952,8 @@ fromline -= 1 fctx = repo[rev].filectx(fname) - revs = (c.rev() for c in context.blockancestors(fctx, fromline, toline)) + revs = (c.rev() for c, _linerange + in context.blockancestors(fctx, fromline, toline)) return subset & generatorset(revs, iterasc=False) @predicate('all()', safe=True) @@ -1638,19 +1452,10 @@ ps -= set([node.nullrev]) return subset & ps -def _phase(repo, subset, target): - """helper to select all rev in phase """ - repo._phasecache.loadphaserevs(repo) # ensure phase's sets are loaded - if repo._phasecache._phasesets: - s = repo._phasecache._phasesets[target] - repo.changelog.filteredrevs - s = baseset(s) - s.sort() # set are non ordered, so we enforce ascending - return subset & s - else: - phase = repo._phasecache.phase - condition = lambda r: phase(repo, r) == target - return subset.filter(condition, condrepr=('', target), - cache=False) +def _phase(repo, subset, *targets): + """helper to select all rev in phases""" + s = repo._phasecache.getrevset(repo, targets) + return subset & s @predicate('draft()', safe=True) def draft(repo, subset, x): @@ -1711,20 +1516,7 @@ @predicate('_notpublic', safe=True) def _notpublic(repo, subset, x): getargs(x, 0, 0, "_notpublic takes no arguments") - repo._phasecache.loadphaserevs(repo) # ensure phase's sets are loaded - if repo._phasecache._phasesets: - s = set() - for u in repo._phasecache._phasesets[1:]: - s.update(u) - s = baseset(s - repo.changelog.filteredrevs) - s.sort() - return subset & s - else: - phase = repo._phasecache.phase - target = phases.public - condition = lambda r: phase(repo, r) != target - return subset.filter(condition, condrepr=('', target), - cache=False) + return 
_phase(repo, subset, phases.draft, phases.secret) @predicate('public()', safe=True) def public(repo, subset, x): @@ -2428,350 +2220,6 @@ "parentpost": parentpost, } -# Constants for ordering requirement, used in _analyze(): -# -# If 'define', any nested functions and operations can change the ordering of -# the entries in the set. If 'follow', any nested functions and operations -# should take the ordering specified by the first operand to the '&' operator. -# -# For instance, -# -# X & (Y | Z) -# ^ ^^^^^^^ -# | follow -# define -# -# will be evaluated as 'or(y(x()), z(x()))', where 'x()' can change the order -# of the entries in the set, but 'y()', 'z()' and 'or()' shouldn't. -# -# 'any' means the order doesn't matter. For instance, -# -# X & !Y -# ^ -# any -# -# 'y()' can either enforce its ordering requirement or take the ordering -# specified by 'x()' because 'not()' doesn't care the order. -# -# Transition of ordering requirement: -# -# 1. starts with 'define' -# 2. shifts to 'follow' by 'x & y' -# 3. changes back to 'define' on function call 'f(x)' or function-like -# operation 'x (f) y' because 'f' may have its own ordering requirement -# for 'x' and 'y' (e.g. 
'first(x)') -# -anyorder = 'any' # don't care the order -defineorder = 'define' # should define the order -followorder = 'follow' # must follow the current order - -# transition table for 'x & y', from the current expression 'x' to 'y' -_tofolloworder = { - anyorder: anyorder, - defineorder: followorder, - followorder: followorder, -} - -def _matchonly(revs, bases): - """ - >>> f = lambda *args: _matchonly(*map(parse, args)) - >>> f('ancestors(A)', 'not ancestors(B)') - ('list', ('symbol', 'A'), ('symbol', 'B')) - """ - if (revs is not None - and revs[0] == 'func' - and getsymbol(revs[1]) == 'ancestors' - and bases is not None - and bases[0] == 'not' - and bases[1][0] == 'func' - and getsymbol(bases[1][1]) == 'ancestors'): - return ('list', revs[2], bases[1][2]) - -def _fixops(x): - """Rewrite raw parsed tree to resolve ambiguous syntax which cannot be - handled well by our simple top-down parser""" - if not isinstance(x, tuple): - return x - - op = x[0] - if op == 'parent': - # x^:y means (x^) : y, not x ^ (:y) - # x^: means (x^) :, not x ^ (:) - post = ('parentpost', x[1]) - if x[2][0] == 'dagrangepre': - return _fixops(('dagrange', post, x[2][1])) - elif x[2][0] == 'rangepre': - return _fixops(('range', post, x[2][1])) - elif x[2][0] == 'rangeall': - return _fixops(('rangepost', post)) - elif op == 'or': - # make number of arguments deterministic: - # x + y + z -> (or x y z) -> (or (list x y z)) - return (op, _fixops(('list',) + x[1:])) - - return (op,) + tuple(_fixops(y) for y in x[1:]) - -def _analyze(x, order): - if x is None: - return x - - op = x[0] - if op == 'minus': - return _analyze(('and', x[1], ('not', x[2])), order) - elif op == 'only': - t = ('func', ('symbol', 'only'), ('list', x[1], x[2])) - return _analyze(t, order) - elif op == 'onlypost': - return _analyze(('func', ('symbol', 'only'), x[1]), order) - elif op == 'dagrangepre': - return _analyze(('func', ('symbol', 'ancestors'), x[1]), order) - elif op == 'dagrangepost': - return 
_analyze(('func', ('symbol', 'descendants'), x[1]), order) - elif op == 'negate': - s = getstring(x[1], _("can't negate that")) - return _analyze(('string', '-' + s), order) - elif op in ('string', 'symbol'): - return x - elif op == 'and': - ta = _analyze(x[1], order) - tb = _analyze(x[2], _tofolloworder[order]) - return (op, ta, tb, order) - elif op == 'or': - return (op, _analyze(x[1], order), order) - elif op == 'not': - return (op, _analyze(x[1], anyorder), order) - elif op == 'rangeall': - return (op, None, order) - elif op in ('rangepre', 'rangepost', 'parentpost'): - return (op, _analyze(x[1], defineorder), order) - elif op == 'group': - return _analyze(x[1], order) - elif op in ('dagrange', 'range', 'parent', 'ancestor'): - ta = _analyze(x[1], defineorder) - tb = _analyze(x[2], defineorder) - return (op, ta, tb, order) - elif op == 'list': - return (op,) + tuple(_analyze(y, order) for y in x[1:]) - elif op == 'keyvalue': - return (op, x[1], _analyze(x[2], order)) - elif op == 'func': - f = getsymbol(x[1]) - d = defineorder - if f == 'present': - # 'present(set)' is known to return the argument set with no - # modification, so forward the current order to its argument - d = order - return (op, x[1], _analyze(x[2], d), order) - raise ValueError('invalid operator %r' % op) - -def analyze(x, order=defineorder): - """Transform raw parsed tree to evaluatable tree which can be fed to - optimize() or getset() - - All pseudo operations should be mapped to real operations or functions - defined in methods or symbols table respectively. - - 'order' specifies how the current expression 'x' is ordered (see the - constants defined above.) 
- """ - return _analyze(x, order) - -def _optimize(x, small): - if x is None: - return 0, x - - smallbonus = 1 - if small: - smallbonus = .5 - - op = x[0] - if op in ('string', 'symbol'): - return smallbonus, x # single revisions are small - elif op == 'and': - wa, ta = _optimize(x[1], True) - wb, tb = _optimize(x[2], True) - order = x[3] - w = min(wa, wb) - - # (::x and not ::y)/(not ::y and ::x) have a fast path - tm = _matchonly(ta, tb) or _matchonly(tb, ta) - if tm: - return w, ('func', ('symbol', 'only'), tm, order) - - if tb is not None and tb[0] == 'not': - return wa, ('difference', ta, tb[1], order) - - if wa > wb: - return w, (op, tb, ta, order) - return w, (op, ta, tb, order) - elif op == 'or': - # fast path for machine-generated expression, that is likely to have - # lots of trivial revisions: 'a + b + c()' to '_list(a b) + c()' - order = x[2] - ws, ts, ss = [], [], [] - def flushss(): - if not ss: - return - if len(ss) == 1: - w, t = ss[0] - else: - s = '\0'.join(t[1] for w, t in ss) - y = ('func', ('symbol', '_list'), ('string', s), order) - w, t = _optimize(y, False) - ws.append(w) - ts.append(t) - del ss[:] - for y in getlist(x[1]): - w, t = _optimize(y, False) - if t is not None and (t[0] == 'string' or t[0] == 'symbol'): - ss.append((w, t)) - continue - flushss() - ws.append(w) - ts.append(t) - flushss() - if len(ts) == 1: - return ws[0], ts[0] # 'or' operation is fully optimized out - # we can't reorder trees by weight because it would change the order. 
- # ("sort(a + b)" == "sort(b + a)", but "a + b" != "b + a") - # ts = tuple(t for w, t in sorted(zip(ws, ts), key=lambda wt: wt[0])) - return max(ws), (op, ('list',) + tuple(ts), order) - elif op == 'not': - # Optimize not public() to _notpublic() because we have a fast version - if x[1][:3] == ('func', ('symbol', 'public'), None): - order = x[1][3] - newsym = ('func', ('symbol', '_notpublic'), None, order) - o = _optimize(newsym, not small) - return o[0], o[1] - else: - o = _optimize(x[1], not small) - order = x[2] - return o[0], (op, o[1], order) - elif op == 'rangeall': - return smallbonus, x - elif op in ('rangepre', 'rangepost', 'parentpost'): - o = _optimize(x[1], small) - order = x[2] - return o[0], (op, o[1], order) - elif op in ('dagrange', 'range', 'parent', 'ancestor'): - wa, ta = _optimize(x[1], small) - wb, tb = _optimize(x[2], small) - order = x[3] - return wa + wb, (op, ta, tb, order) - elif op == 'list': - ws, ts = zip(*(_optimize(y, small) for y in x[1:])) - return sum(ws), (op,) + ts - elif op == 'keyvalue': - w, t = _optimize(x[2], small) - return w, (op, x[1], t) - elif op == 'func': - f = getsymbol(x[1]) - wa, ta = _optimize(x[2], small) - if f in ('author', 'branch', 'closed', 'date', 'desc', 'file', 'grep', - 'keyword', 'outgoing', 'user', 'destination'): - w = 10 # slow - elif f in ('modifies', 'adds', 'removes'): - w = 30 # slower - elif f == "contains": - w = 100 # very slow - elif f == "ancestor": - w = 1 * smallbonus - elif f in ('reverse', 'limit', 'first', 'wdir', '_intlist'): - w = 0 - elif f == "sort": - w = 10 # assume most sorts look at changelog - else: - w = 1 - order = x[3] - return w + wa, (op, x[1], ta, order) - raise ValueError('invalid operator %r' % op) - -def optimize(tree): - """Optimize evaluatable tree - - All pseudo operations should be transformed beforehand. 
- """ - _weight, newtree = _optimize(tree, small=True) - return newtree - -# the set of valid characters for the initial letter of symbols in -# alias declarations and definitions -_aliassyminitletters = _syminitletters | set(pycompat.sysstr('$')) - -def _parsewith(spec, lookup=None, syminitletters=None): - """Generate a parse tree of given spec with given tokenizing options - - >>> _parsewith('foo($1)', syminitletters=_aliassyminitletters) - ('func', ('symbol', 'foo'), ('symbol', '$1')) - >>> _parsewith('$1') - Traceback (most recent call last): - ... - ParseError: ("syntax error in revset '$1'", 0) - >>> _parsewith('foo bar') - Traceback (most recent call last): - ... - ParseError: ('invalid token', 4) - """ - p = parser.parser(elements) - tree, pos = p.parse(tokenize(spec, lookup=lookup, - syminitletters=syminitletters)) - if pos != len(spec): - raise error.ParseError(_('invalid token'), pos) - return _fixops(parser.simplifyinfixops(tree, ('list', 'or'))) - -class _aliasrules(parser.basealiasrules): - """Parsing and expansion rule set of revset aliases""" - _section = _('revset alias') - - @staticmethod - def _parse(spec): - """Parse alias declaration/definition ``spec`` - - This allows symbol names to use also ``$`` as an initial letter - (for backward compatibility), and callers of this function should - examine whether ``$`` is used also for unexpected symbols or not. 
- """ - return _parsewith(spec, syminitletters=_aliassyminitletters) - - @staticmethod - def _trygetfunc(tree): - if tree[0] == 'func' and tree[1][0] == 'symbol': - return tree[1][1], getlist(tree[2]) - -def expandaliases(ui, tree): - aliases = _aliasrules.buildmap(ui.configitems('revsetalias')) - tree = _aliasrules.expand(aliases, tree) - # warn about problematic (but not referred) aliases - for name, alias in sorted(aliases.iteritems()): - if alias.error and not alias.warned: - ui.warn(_('warning: %s\n') % (alias.error)) - alias.warned = True - return tree - -def foldconcat(tree): - """Fold elements to be concatenated by `##` - """ - if not isinstance(tree, tuple) or tree[0] in ('string', 'symbol'): - return tree - if tree[0] == '_concat': - pending = [tree] - l = [] - while pending: - e = pending.pop() - if e[0] == '_concat': - pending.extend(reversed(e[1:])) - elif e[0] in ('string', 'symbol'): - l.append(e[1]) - else: - msg = _("\"##\" can't concatenate \"%s\" element") % (e[0]) - raise error.ParseError(msg) - return ('string', ''.join(l)) - else: - return tuple(foldconcat(t) for t in tree) - -def parse(spec, lookup=None): - return _parsewith(spec, lookup=lookup) - def posttreebuilthook(tree, repo): # hook for extensions to execute code on the optimized tree pass @@ -2801,15 +2249,16 @@ if repo: lookup = repo.__contains__ if len(specs) == 1: - tree = parse(specs[0], lookup) + tree = revsetlang.parse(specs[0], lookup) else: - tree = ('or', ('list',) + tuple(parse(s, lookup) for s in specs)) + tree = ('or', + ('list',) + tuple(revsetlang.parse(s, lookup) for s in specs)) if ui: - tree = expandaliases(ui, tree) - tree = foldconcat(tree) - tree = analyze(tree, order) - tree = optimize(tree) + tree = revsetlang.expandaliases(ui, tree) + tree = revsetlang.foldconcat(tree) + tree = revsetlang.analyze(tree, order) + tree = revsetlang.optimize(tree) posttreebuilthook(tree, repo) return makematcher(tree) @@ -2825,1082 +2274,6 @@ return result return mfunc -def 
formatspec(expr, *args): - ''' - This is a convenience function for using revsets internally, and - escapes arguments appropriately. Aliases are intentionally ignored - so that intended expression behavior isn't accidentally subverted. - - Supported arguments: - - %r = revset expression, parenthesized - %d = int(arg), no quoting - %s = string(arg), escaped and single-quoted - %b = arg.branch(), escaped and single-quoted - %n = hex(arg), single-quoted - %% = a literal '%' - - Prefixing the type with 'l' specifies a parenthesized list of that type. - - >>> formatspec('%r:: and %lr', '10 or 11', ("this()", "that()")) - '(10 or 11):: and ((this()) or (that()))' - >>> formatspec('%d:: and not %d::', 10, 20) - '10:: and not 20::' - >>> formatspec('%ld or %ld', [], [1]) - "_list('') or 1" - >>> formatspec('keyword(%s)', 'foo\\xe9') - "keyword('foo\\\\xe9')" - >>> b = lambda: 'default' - >>> b.branch = b - >>> formatspec('branch(%b)', b) - "branch('default')" - >>> formatspec('root(%ls)', ['a', 'b', 'c', 'd']) - "root(_list('a\\x00b\\x00c\\x00d'))" - ''' - - def quote(s): - return repr(str(s)) - - def argtype(c, arg): - if c == 'd': - return str(int(arg)) - elif c == 's': - return quote(arg) - elif c == 'r': - parse(arg) # make sure syntax errors are confined - return '(%s)' % arg - elif c == 'n': - return quote(node.hex(arg)) - elif c == 'b': - return quote(arg.branch()) - - def listexp(s, t): - l = len(s) - if l == 0: - return "_list('')" - elif l == 1: - return argtype(t, s[0]) - elif t == 'd': - return "_intlist('%s')" % "\0".join(str(int(a)) for a in s) - elif t == 's': - return "_list('%s')" % "\0".join(s) - elif t == 'n': - return "_hexlist('%s')" % "\0".join(node.hex(a) for a in s) - elif t == 'b': - return "_list('%s')" % "\0".join(a.branch() for a in s) - - m = l // 2 - return '(%s or %s)' % (listexp(s[:m], t), listexp(s[m:], t)) - - ret = '' - pos = 0 - arg = 0 - while pos < len(expr): - c = expr[pos] - if c == '%': - pos += 1 - d = expr[pos] - if d == '%': - 
ret += d - elif d in 'dsnbr': - ret += argtype(d, args[arg]) - arg += 1 - elif d == 'l': - # a list of some type - pos += 1 - d = expr[pos] - ret += listexp(list(args[arg]), d) - arg += 1 - else: - raise error.Abort(_('unexpected revspec format character %s') - % d) - else: - ret += c - pos += 1 - - return ret - -def prettyformat(tree): - return parser.prettyformat(tree, ('string', 'symbol')) - -def depth(tree): - if isinstance(tree, tuple): - return max(map(depth, tree)) + 1 - else: - return 0 - -def funcsused(tree): - if not isinstance(tree, tuple) or tree[0] in ('string', 'symbol'): - return set() - else: - funcs = set() - for s in tree[1:]: - funcs |= funcsused(s) - if tree[0] == 'func': - funcs.add(tree[1][1]) - return funcs - -def _formatsetrepr(r): - """Format an optional printable representation of a set - - ======== ================================= - type(r) example - ======== ================================= - tuple ('', other) - str '' - callable lambda: '' % sorted(b) - object other - ======== ================================= - """ - if r is None: - return '' - elif isinstance(r, tuple): - return r[0] % r[1:] - elif isinstance(r, str): - return r - elif callable(r): - return r() - else: - return repr(r) - -class abstractsmartset(object): - - def __nonzero__(self): - """True if the smartset is not empty""" - raise NotImplementedError() - - def __contains__(self, rev): - """provide fast membership testing""" - raise NotImplementedError() - - def __iter__(self): - """iterate the set in the order it is supposed to be iterated""" - raise NotImplementedError() - - # Attributes containing a function to perform a fast iteration in a given - # direction. A smartset can have none, one, or both defined. - # - # Default value is None instead of a function returning None to avoid - # initializing an iterator just for testing if a fast method exists. 
-    fastasc = None
-    fastdesc = None
-
-    def isascending(self):
-        """True if the set will iterate in ascending order"""
-        raise NotImplementedError()
-
-    def isdescending(self):
-        """True if the set will iterate in descending order"""
-        raise NotImplementedError()
-
-    def istopo(self):
-        """True if the set will iterate in topographical order"""
-        raise NotImplementedError()
-
-    def min(self):
-        """return the minimum element in the set"""
-        if self.fastasc is None:
-            v = min(self)
-        else:
-            for v in self.fastasc():
-                break
-            else:
-                raise ValueError('arg is an empty sequence')
-        self.min = lambda: v
-        return v
-
-    def max(self):
-        """return the maximum element in the set"""
-        if self.fastdesc is None:
-            return max(self)
-        else:
-            for v in self.fastdesc():
-                break
-            else:
-                raise ValueError('arg is an empty sequence')
-        self.max = lambda: v
-        return v
-
-    def first(self):
-        """return the first element in the set (user iteration perspective)
-
-        Return None if the set is empty"""
-        raise NotImplementedError()
-
-    def last(self):
-        """return the last element in the set (user iteration perspective)
-
-        Return None if the set is empty"""
-        raise NotImplementedError()
-
-    def __len__(self):
-        """return the length of the smartsets
-
-        This can be expensive on smartset that could be lazy otherwise."""
-        raise NotImplementedError()
-
-    def reverse(self):
-        """reverse the expected iteration order"""
-        raise NotImplementedError()
-
-    def sort(self, reverse=True):
-        """get the set to iterate in an ascending or descending order"""
-        raise NotImplementedError()
-
-    def __and__(self, other):
-        """Returns a new object with the intersection of the two collections.
-
-        This is part of the mandatory API for smartset."""
-        if isinstance(other, fullreposet):
-            return self
-        return self.filter(other.__contains__, condrepr=other, cache=False)
-
-    def __add__(self, other):
-        """Returns a new object with the union of the two collections.
-
-        This is part of the mandatory API for smartset."""
-        return addset(self, other)
-
-    def __sub__(self, other):
-        """Returns a new object with the substraction of the two collections.
-
-        This is part of the mandatory API for smartset."""
-        c = other.__contains__
-        return self.filter(lambda r: not c(r), condrepr=('', other),
-                           cache=False)
-
-    def filter(self, condition, condrepr=None, cache=True):
-        """Returns this smartset filtered by condition as a new smartset.
-
-        `condition` is a callable which takes a revision number and returns a
-        boolean. Optional `condrepr` provides a printable representation of
-        the given `condition`.
-
-        This is part of the mandatory API for smartset."""
-        # builtin cannot be cached. but do not needs to
-        if cache and util.safehasattr(condition, 'func_code'):
-            condition = util.cachefunc(condition)
-        return filteredset(self, condition, condrepr)
-
-class baseset(abstractsmartset):
-    """Basic data structure that represents a revset and contains the basic
-    operation that it should be able to perform.
-
-    Every method in this class should be implemented by any smartset class.
-    """
-    def __init__(self, data=(), datarepr=None, istopo=False):
-        """
-        datarepr: a tuple of (format, obj, ...), a function or an object that
-                  provides a printable representation of the given data.
-        """
-        self._ascending = None
-        self._istopo = istopo
-        if not isinstance(data, list):
-            if isinstance(data, set):
-                self._set = data
-                # set has no order we pick one for stability purpose
-                self._ascending = True
-            data = list(data)
-        self._list = data
-        self._datarepr = datarepr
-
-    @util.propertycache
-    def _set(self):
-        return set(self._list)
-
-    @util.propertycache
-    def _asclist(self):
-        asclist = self._list[:]
-        asclist.sort()
-        return asclist
-
-    def __iter__(self):
-        if self._ascending is None:
-            return iter(self._list)
-        elif self._ascending:
-            return iter(self._asclist)
-        else:
-            return reversed(self._asclist)
-
-    def fastasc(self):
-        return iter(self._asclist)
-
-    def fastdesc(self):
-        return reversed(self._asclist)
-
-    @util.propertycache
-    def __contains__(self):
-        return self._set.__contains__
-
-    def __nonzero__(self):
-        return bool(self._list)
-
-    def sort(self, reverse=False):
-        self._ascending = not bool(reverse)
-        self._istopo = False
-
-    def reverse(self):
-        if self._ascending is None:
-            self._list.reverse()
-        else:
-            self._ascending = not self._ascending
-        self._istopo = False
-
-    def __len__(self):
-        return len(self._list)
-
-    def isascending(self):
-        """Returns True if the collection is ascending order, False if not.
-
-        This is part of the mandatory API for smartset."""
-        if len(self) <= 1:
-            return True
-        return self._ascending is not None and self._ascending
-
-    def isdescending(self):
-        """Returns True if the collection is descending order, False if not.
-
-        This is part of the mandatory API for smartset."""
-        if len(self) <= 1:
-            return True
-        return self._ascending is not None and not self._ascending
-
-    def istopo(self):
-        """Is the collection is in topographical order or not.
-
-        This is part of the mandatory API for smartset."""
-        if len(self) <= 1:
-            return True
-        return self._istopo
-
-    def first(self):
-        if self:
-            if self._ascending is None:
-                return self._list[0]
-            elif self._ascending:
-                return self._asclist[0]
-            else:
-                return self._asclist[-1]
-        return None
-
-    def last(self):
-        if self:
-            if self._ascending is None:
-                return self._list[-1]
-            elif self._ascending:
-                return self._asclist[-1]
-            else:
-                return self._asclist[0]
-        return None
-
-    def __repr__(self):
-        d = {None: '', False: '-', True: '+'}[self._ascending]
-        s = _formatsetrepr(self._datarepr)
-        if not s:
-            l = self._list
-            # if _list has been built from a set, it might have a different
-            # order from one python implementation to another.
-            # We fallback to the sorted version for a stable output.
-            if self._ascending is not None:
-                l = self._asclist
-            s = repr(l)
-        return '<%s%s %s>' % (type(self).__name__, d, s)
-
-class filteredset(abstractsmartset):
-    """Duck type for baseset class which iterates lazily over the revisions in
-    the subset and contains a function which tests for membership in the
-    revset
-    """
-    def __init__(self, subset, condition=lambda x: True, condrepr=None):
-        """
-        condition: a function that decide whether a revision in the subset
-                   belongs to the revset or not.
-        condrepr: a tuple of (format, obj, ...), a function or an object that
-                  provides a printable representation of the given condition.
-        """
-        self._subset = subset
-        self._condition = condition
-        self._condrepr = condrepr
-
-    def __contains__(self, x):
-        return x in self._subset and self._condition(x)
-
-    def __iter__(self):
-        return self._iterfilter(self._subset)
-
-    def _iterfilter(self, it):
-        cond = self._condition
-        for x in it:
-            if cond(x):
-                yield x
-
-    @property
-    def fastasc(self):
-        it = self._subset.fastasc
-        if it is None:
-            return None
-        return lambda: self._iterfilter(it())
-
-    @property
-    def fastdesc(self):
-        it = self._subset.fastdesc
-        if it is None:
-            return None
-        return lambda: self._iterfilter(it())
-
-    def __nonzero__(self):
-        fast = None
-        candidates = [self.fastasc if self.isascending() else None,
-                      self.fastdesc if self.isdescending() else None,
-                      self.fastasc,
-                      self.fastdesc]
-        for candidate in candidates:
-            if candidate is not None:
-                fast = candidate
-                break
-
-        if fast is not None:
-            it = fast()
-        else:
-            it = self
-
-        for r in it:
-            return True
-        return False
-
-    def __len__(self):
-        # Basic implementation to be changed in future patches.
-        # until this gets improved, we use generator expression
-        # here, since list comprehensions are free to call __len__ again
-        # causing infinite recursion
-        l = baseset(r for r in self)
-        return len(l)
-
-    def sort(self, reverse=False):
-        self._subset.sort(reverse=reverse)
-
-    def reverse(self):
-        self._subset.reverse()
-
-    def isascending(self):
-        return self._subset.isascending()
-
-    def isdescending(self):
-        return self._subset.isdescending()
-
-    def istopo(self):
-        return self._subset.istopo()
-
-    def first(self):
-        for x in self:
-            return x
-        return None
-
-    def last(self):
-        it = None
-        if self.isascending():
-            it = self.fastdesc
-        elif self.isdescending():
-            it = self.fastasc
-        if it is not None:
-            for x in it():
-                return x
-            return None #empty case
-        else:
-            x = None
-            for x in self:
-                pass
-            return x
-
-    def __repr__(self):
-        xs = [repr(self._subset)]
-        s = _formatsetrepr(self._condrepr)
-        if s:
-            xs.append(s)
-        return '<%s %s>' % (type(self).__name__, ', '.join(xs))
-
-def _iterordered(ascending, iter1, iter2):
-    """produce an ordered iteration from two iterators with the same order
-
-    The ascending is used to indicated the iteration direction.
-    """
-    choice = max
-    if ascending:
-        choice = min
-
-    val1 = None
-    val2 = None
-    try:
-        # Consume both iterators in an ordered way until one is empty
-        while True:
-            if val1 is None:
-                val1 = next(iter1)
-            if val2 is None:
-                val2 = next(iter2)
-            n = choice(val1, val2)
-            yield n
-            if val1 == n:
-                val1 = None
-            if val2 == n:
-                val2 = None
-    except StopIteration:
-        # Flush any remaining values and consume the other one
-        it = iter2
-        if val1 is not None:
-            yield val1
-            it = iter1
-        elif val2 is not None:
-            # might have been equality and both are empty
-            yield val2
-        for val in it:
-            yield val
-
-class addset(abstractsmartset):
-    """Represent the addition of two sets
-
-    Wrapper structure for lazily adding two structures without losing much
-    performance on the __contains__ method
-
-    If the ascending attribute is set, that means the two structures are
-    ordered in either an ascending or descending way. Therefore, we can add
-    them maintaining the order by iterating over both at the same time
-
-    >>> xs = baseset([0, 3, 2])
-    >>> ys = baseset([5, 2, 4])
-
-    >>> rs = addset(xs, ys)
-    >>> bool(rs), 0 in rs, 1 in rs, 5 in rs, rs.first(), rs.last()
-    (True, True, False, True, 0, 4)
-    >>> rs = addset(xs, baseset([]))
-    >>> bool(rs), 0 in rs, 1 in rs, rs.first(), rs.last()
-    (True, True, False, 0, 2)
-    >>> rs = addset(baseset([]), baseset([]))
-    >>> bool(rs), 0 in rs, rs.first(), rs.last()
-    (False, False, None, None)
-
-    iterate unsorted:
-    >>> rs = addset(xs, ys)
-    >>> # (use generator because pypy could call len())
-    >>> list(x for x in rs) # without _genlist
-    [0, 3, 2, 5, 4]
-    >>> assert not rs._genlist
-    >>> len(rs)
-    5
-    >>> [x for x in rs] # with _genlist
-    [0, 3, 2, 5, 4]
-    >>> assert rs._genlist
-
-    iterate ascending:
-    >>> rs = addset(xs, ys, ascending=True)
-    >>> # (use generator because pypy could call len())
-    >>> list(x for x in rs), list(x for x in rs.fastasc()) # without _asclist
-    ([0, 2, 3, 4, 5], [0, 2, 3, 4, 5])
-    >>> assert not rs._asclist
-    >>> len(rs)
-    5
-    >>> [x for x in rs], [x for x in rs.fastasc()]
-    ([0, 2, 3, 4, 5], [0, 2, 3, 4, 5])
-    >>> assert rs._asclist
-
-    iterate descending:
-    >>> rs = addset(xs, ys, ascending=False)
-    >>> # (use generator because pypy could call len())
-    >>> list(x for x in rs), list(x for x in rs.fastdesc()) # without _asclist
-    ([5, 4, 3, 2, 0], [5, 4, 3, 2, 0])
-    >>> assert not rs._asclist
-    >>> len(rs)
-    5
-    >>> [x for x in rs], [x for x in rs.fastdesc()]
-    ([5, 4, 3, 2, 0], [5, 4, 3, 2, 0])
-    >>> assert rs._asclist
-
-    iterate ascending without fastasc:
-    >>> rs = addset(xs, generatorset(ys), ascending=True)
-    >>> assert rs.fastasc is None
-    >>> [x for x in rs]
-    [0, 2, 3, 4, 5]
-
-    iterate descending without fastdesc:
-    >>> rs = addset(generatorset(xs), ys, ascending=False)
-    >>> assert rs.fastdesc is None
-    >>> [x for x in rs]
-    [5, 4, 3, 2, 0]
-    """
-    def __init__(self, revs1, revs2, ascending=None):
-        self._r1 = revs1
-        self._r2 = revs2
-        self._iter = None
-        self._ascending = ascending
-        self._genlist = None
-        self._asclist = None
-
-    def __len__(self):
-        return len(self._list)
-
-    def __nonzero__(self):
-        return bool(self._r1) or bool(self._r2)
-
-    @util.propertycache
-    def _list(self):
-        if not self._genlist:
-            self._genlist = baseset(iter(self))
-        return self._genlist
-
-    def __iter__(self):
-        """Iterate over both collections without repeating elements
-
-        If the ascending attribute is not set, iterate over the first one and
-        then over the second one checking for membership on the first one so we
-        dont yield any duplicates.
-
-        If the ascending attribute is set, iterate over both collections at the
-        same time, yielding only one value at a time in the given order.
-        """
-        if self._ascending is None:
-            if self._genlist:
-                return iter(self._genlist)
-            def arbitraryordergen():
-                for r in self._r1:
-                    yield r
-                inr1 = self._r1.__contains__
-                for r in self._r2:
-                    if not inr1(r):
-                        yield r
-            return arbitraryordergen()
-        # try to use our own fast iterator if it exists
-        self._trysetasclist()
-        if self._ascending:
-            attr = 'fastasc'
-        else:
-            attr = 'fastdesc'
-        it = getattr(self, attr)
-        if it is not None:
-            return it()
-        # maybe half of the component supports fast
-        # get iterator for _r1
-        iter1 = getattr(self._r1, attr)
-        if iter1 is None:
-            # let's avoid side effect (not sure it matters)
-            iter1 = iter(sorted(self._r1, reverse=not self._ascending))
-        else:
-            iter1 = iter1()
-        # get iterator for _r2
-        iter2 = getattr(self._r2, attr)
-        if iter2 is None:
-            # let's avoid side effect (not sure it matters)
-            iter2 = iter(sorted(self._r2, reverse=not self._ascending))
-        else:
-            iter2 = iter2()
-        return _iterordered(self._ascending, iter1, iter2)
-
-    def _trysetasclist(self):
-        """populate the _asclist attribute if possible and necessary"""
-        if self._genlist is not None and self._asclist is None:
-            self._asclist = sorted(self._genlist)
-
-    @property
-    def fastasc(self):
-        self._trysetasclist()
-        if self._asclist is not None:
-            return self._asclist.__iter__
-        iter1 = self._r1.fastasc
-        iter2 = self._r2.fastasc
-        if None in (iter1, iter2):
-            return None
-        return lambda: _iterordered(True, iter1(), iter2())
-
-    @property
-    def fastdesc(self):
-        self._trysetasclist()
-        if self._asclist is not None:
-            return self._asclist.__reversed__
-        iter1 = self._r1.fastdesc
-        iter2 = self._r2.fastdesc
-        if None in (iter1, iter2):
-            return None
-        return lambda: _iterordered(False, iter1(), iter2())
-
-    def __contains__(self, x):
-        return x in self._r1 or x in self._r2
-
-    def sort(self, reverse=False):
-        """Sort the added set
-
-        For this we use the cached list with all the generated values and if we
-        know they are ascending or descending we can sort them in a smart way.
-        """
-        self._ascending = not reverse
-
-    def isascending(self):
-        return self._ascending is not None and self._ascending
-
-    def isdescending(self):
-        return self._ascending is not None and not self._ascending
-
-    def istopo(self):
-        # not worth the trouble asserting if the two sets combined are still
-        # in topographical order. Use the sort() predicate to explicitly sort
-        # again instead.
-        return False
-
-    def reverse(self):
-        if self._ascending is None:
-            self._list.reverse()
-        else:
-            self._ascending = not self._ascending
-
-    def first(self):
-        for x in self:
-            return x
-        return None
-
-    def last(self):
-        self.reverse()
-        val = self.first()
-        self.reverse()
-        return val
-
-    def __repr__(self):
-        d = {None: '', False: '-', True: '+'}[self._ascending]
-        return '<%s%s %r, %r>' % (type(self).__name__, d, self._r1, self._r2)
-
-class generatorset(abstractsmartset):
-    """Wrap a generator for lazy iteration
-
-    Wrapper structure for generators that provides lazy membership and can
-    be iterated more than once.
-    When asked for membership it generates values until either it finds the
-    requested one or has gone through all the elements in the generator
-    """
-    def __init__(self, gen, iterasc=None):
-        """
-        gen: a generator producing the values for the generatorset.
-        """
-        self._gen = gen
-        self._asclist = None
-        self._cache = {}
-        self._genlist = []
-        self._finished = False
-        self._ascending = True
-        if iterasc is not None:
-            if iterasc:
-                self.fastasc = self._iterator
-                self.__contains__ = self._asccontains
-            else:
-                self.fastdesc = self._iterator
-                self.__contains__ = self._desccontains
-
-    def __nonzero__(self):
-        # Do not use 'for r in self' because it will enforce the iteration
-        # order (default ascending), possibly unrolling a whole descending
-        # iterator.
-        if self._genlist:
-            return True
-        for r in self._consumegen():
-            return True
-        return False
-
-    def __contains__(self, x):
-        if x in self._cache:
-            return self._cache[x]
-
-        # Use new values only, as existing values would be cached.
-        for l in self._consumegen():
-            if l == x:
-                return True
-
-        self._cache[x] = False
-        return False
-
-    def _asccontains(self, x):
-        """version of contains optimised for ascending generator"""
-        if x in self._cache:
-            return self._cache[x]
-
-        # Use new values only, as existing values would be cached.
-        for l in self._consumegen():
-            if l == x:
-                return True
-            if l > x:
-                break
-
-        self._cache[x] = False
-        return False
-
-    def _desccontains(self, x):
-        """version of contains optimised for descending generator"""
-        if x in self._cache:
-            return self._cache[x]
-
-        # Use new values only, as existing values would be cached.
-        for l in self._consumegen():
-            if l == x:
-                return True
-            if l < x:
-                break
-
-        self._cache[x] = False
-        return False
-
-    def __iter__(self):
-        if self._ascending:
-            it = self.fastasc
-        else:
-            it = self.fastdesc
-        if it is not None:
-            return it()
-        # we need to consume the iterator
-        for x in self._consumegen():
-            pass
-        # recall the same code
-        return iter(self)
-
-    def _iterator(self):
-        if self._finished:
-            return iter(self._genlist)
-
-        # We have to use this complex iteration strategy to allow multiple
-        # iterations at the same time. We need to be able to catch revision
-        # removed from _consumegen and added to genlist in another instance.
-        #
-        # Getting rid of it would provide an about 15% speed up on this
-        # iteration.
-        genlist = self._genlist
-        nextrev = self._consumegen().next
-        _len = len # cache global lookup
-        def gen():
-            i = 0
-            while True:
-                if i < _len(genlist):
-                    yield genlist[i]
-                else:
-                    yield nextrev()
-                i += 1
-        return gen()
-
-    def _consumegen(self):
-        cache = self._cache
-        genlist = self._genlist.append
-        for item in self._gen:
-            cache[item] = True
-            genlist(item)
-            yield item
-        if not self._finished:
-            self._finished = True
-            asc = self._genlist[:]
-            asc.sort()
-            self._asclist = asc
-            self.fastasc = asc.__iter__
-            self.fastdesc = asc.__reversed__
-
-    def __len__(self):
-        for x in self._consumegen():
-            pass
-        return len(self._genlist)
-
-    def sort(self, reverse=False):
-        self._ascending = not reverse
-
-    def reverse(self):
-        self._ascending = not self._ascending
-
-    def isascending(self):
-        return self._ascending
-
-    def isdescending(self):
-        return not self._ascending
-
-    def istopo(self):
-        # not worth the trouble asserting if the two sets combined are still
-        # in topographical order. Use the sort() predicate to explicitly sort
-        # again instead.
-        return False
-
-    def first(self):
-        if self._ascending:
-            it = self.fastasc
-        else:
-            it = self.fastdesc
-        if it is None:
-            # we need to consume all and try again
-            for x in self._consumegen():
-                pass
-            return self.first()
-        return next(it(), None)
-
-    def last(self):
-        if self._ascending:
-            it = self.fastdesc
-        else:
-            it = self.fastasc
-        if it is None:
-            # we need to consume all and try again
-            for x in self._consumegen():
-                pass
-            return self.first()
-        return next(it(), None)
-
-    def __repr__(self):
-        d = {False: '-', True: '+'}[self._ascending]
-        return '<%s%s>' % (type(self).__name__, d)
-
-class spanset(abstractsmartset):
-    """Duck type for baseset class which represents a range of revisions and
-    can work lazily and without having all the range in memory
-
-    Note that spanset(x, y) behave almost like xrange(x, y) except for two
-    notable points:
-    - when x < y it will be automatically descending,
-    - revision filtered with this repoview will be skipped.
-
-    """
-    def __init__(self, repo, start=0, end=None):
-        """
-        start: first revision included the set
-               (default to 0)
-        end:   first revision excluded (last+1)
-               (default to len(repo)
-
-        Spanset will be descending if `end` < `start`.
-        """
-        if end is None:
-            end = len(repo)
-        self._ascending = start <= end
-        if not self._ascending:
-            start, end = end + 1, start +1
-        self._start = start
-        self._end = end
-        self._hiddenrevs = repo.changelog.filteredrevs
-
-    def sort(self, reverse=False):
-        self._ascending = not reverse
-
-    def reverse(self):
-        self._ascending = not self._ascending
-
-    def istopo(self):
-        # not worth the trouble asserting if the two sets combined are still
-        # in topographical order. Use the sort() predicate to explicitly sort
-        # again instead.
-        return False
-
-    def _iterfilter(self, iterrange):
-        s = self._hiddenrevs
-        for r in iterrange:
-            if r not in s:
-                yield r
-
-    def __iter__(self):
-        if self._ascending:
-            return self.fastasc()
-        else:
-            return self.fastdesc()
-
-    def fastasc(self):
-        iterrange = xrange(self._start, self._end)
-        if self._hiddenrevs:
-            return self._iterfilter(iterrange)
-        return iter(iterrange)
-
-    def fastdesc(self):
-        iterrange = xrange(self._end - 1, self._start - 1, -1)
-        if self._hiddenrevs:
-            return self._iterfilter(iterrange)
-        return iter(iterrange)
-
-    def __contains__(self, rev):
-        hidden = self._hiddenrevs
-        return ((self._start <= rev < self._end)
-                and not (hidden and rev in hidden))
-
-    def __nonzero__(self):
-        for r in self:
-            return True
-        return False
-
-    def __len__(self):
-        if not self._hiddenrevs:
-            return abs(self._end - self._start)
-        else:
-            count = 0
-            start = self._start
-            end = self._end
-            for rev in self._hiddenrevs:
-                if (end < rev <= start) or (start <= rev < end):
-                    count += 1
-            return abs(self._end - self._start) - count
-
-    def isascending(self):
-        return self._ascending
-
-    def isdescending(self):
-        return not self._ascending
-
-    def first(self):
-        if self._ascending:
-            it = self.fastasc
-        else:
-            it = self.fastdesc
-        for x in it():
-            return x
-        return None
-
-    def last(self):
-        if self._ascending:
-            it = self.fastdesc
-        else:
-            it = self.fastasc
-        for x in it():
-            return x
-        return None
-
-    def __repr__(self):
-        d = {False: '-', True: '+'}[self._ascending]
-        return '<%s%s %d:%d>' % (type(self).__name__, d,
-                                 self._start, self._end - 1)
-
-class fullreposet(spanset):
-    """a set containing all revisions in the repo
-
-    This class exists to host special optimization and magic to handle virtual
-    revisions such as "null".
-    """
-
-    def __init__(self, repo):
-        super(fullreposet, self).__init__(repo)
-
-    def __and__(self, other):
-        """As self contains the whole repo, all of the other set should also be
-        in self. Therefore `self & other = other`.
-
-        This boldly assumes the other contains valid revs only.
-        """
-        # other not a smartset, make is so
-        if not util.safehasattr(other, 'isascending'):
-            # filter out hidden revision
-            # (this boldly assumes all smartset are pure)
-            #
-            # `other` was used with "&", let's assume this is a set like
-            # object.
-            other = baseset(other - self._hiddenrevs)
-
-        other.sort(reverse=self.isdescending())
-        return other
-
-def prettyformatset(revs):
-    lines = []
-    rs = repr(revs)
-    p = 0
-    while p < len(rs):
-        q = rs.find('<', p + 1)
-        if q < 0:
-            q = len(rs)
-        l = rs.count('<', 0, p) - rs.count('>', 0, p)
-        assert l >= 0
-        lines.append((l, rs[p:q].rstrip()))
-        p = q
-    return '\n'.join(' ' * l + s for l, s in lines)
-
 def loadpredicate(ui, extname, registrarobj):
     """Load revset predicates from specified registrarobj
     """
diff -r ed5b25874d99 -r 4baf79a77afa mercurial/revsetlang.py
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/mercurial/revsetlang.py Fri Mar 24 08:37:26 2017 -0700
@@ -0,0 +1,702 @@
+# revsetlang.py - parser, tokenizer and utility for revision set language
+#
+# Copyright 2010 Matt Mackall
+#
+# This software may be used and distributed according to the terms of the
+# GNU General Public License version 2 or any later version.
+
+from __future__ import absolute_import
+
+import string
+
+from .i18n import _
+from . import (
+    error,
+    node,
+    parser,
+    pycompat,
+    util,
+)
+
+elements = {
+    # token-type: binding-strength, primary, prefix, infix, suffix
+    "(": (21, None, ("group", 1, ")"), ("func", 1, ")"), None),
+    "##": (20, None, None, ("_concat", 20), None),
+    "~": (18, None, None, ("ancestor", 18), None),
+    "^": (18, None, None, ("parent", 18), "parentpost"),
+    "-": (5, None, ("negate", 19), ("minus", 5), None),
+    "::": (17, None, ("dagrangepre", 17), ("dagrange", 17), "dagrangepost"),
+    "..": (17, None, ("dagrangepre", 17), ("dagrange", 17), "dagrangepost"),
+    ":": (15, "rangeall", ("rangepre", 15), ("range", 15), "rangepost"),
+    "not": (10, None, ("not", 10), None, None),
+    "!": (10, None, ("not", 10), None, None),
+    "and": (5, None, None, ("and", 5), None),
+    "&": (5, None, None, ("and", 5), None),
+    "%": (5, None, None, ("only", 5), "onlypost"),
+    "or": (4, None, None, ("or", 4), None),
+    "|": (4, None, None, ("or", 4), None),
+    "+": (4, None, None, ("or", 4), None),
+    "=": (3, None, None, ("keyvalue", 3), None),
+    ",": (2, None, None, ("list", 2), None),
+    ")": (0, None, None, None, None),
+    "symbol": (0, "symbol", None, None, None),
+    "string": (0, "string", None, None, None),
+    "end": (0, None, None, None, None),
+}
+
+keywords = set(['and', 'or', 'not'])
+
+_quoteletters = set(['"', "'"])
+_simpleopletters = set(pycompat.iterbytestr("():=,-|&+!~^%"))
+
+# default set of valid characters for the initial letter of symbols
+_syminitletters = set(pycompat.iterbytestr(
+    string.ascii_letters.encode('ascii') +
+    string.digits.encode('ascii') +
+    '._@')) | set(map(pycompat.bytechr, xrange(128, 256)))
+
+# default set of valid characters for non-initial letters of symbols
+_symletters = _syminitletters | set(pycompat.iterbytestr('-/'))
+
+def tokenize(program, lookup=None, syminitletters=None, symletters=None):
+    '''
+    Parse a revset statement into a stream of tokens
+
+    ``syminitletters`` is the set of valid characters for the initial
+    letter of symbols.
+
+    By default, character ``c`` is recognized as valid for initial
+    letter of symbols, if ``c.isalnum() or c in '._@' or ord(c) > 127``.
+
+    ``symletters`` is the set of valid characters for non-initial
+    letters of symbols.
+
+    By default, character ``c`` is recognized as valid for non-initial
+    letters of symbols, if ``c.isalnum() or c in '-._/@' or ord(c) > 127``.
+
+    Check that @ is a valid unquoted token character (issue3686):
+    >>> list(tokenize("@::"))
+    [('symbol', '@', 0), ('::', None, 1), ('end', None, 3)]
+
+    '''
+    program = pycompat.bytestr(program)
+    if syminitletters is None:
+        syminitletters = _syminitletters
+    if symletters is None:
+        symletters = _symletters
+
+    if program and lookup:
+        # attempt to parse old-style ranges first to deal with
+        # things like old-tag which contain query metacharacters
+        parts = program.split(':', 1)
+        if all(lookup(sym) for sym in parts if sym):
+            if parts[0]:
+                yield ('symbol', parts[0], 0)
+            if len(parts) > 1:
+                s = len(parts[0])
+                yield (':', None, s)
+                if parts[1]:
+                    yield ('symbol', parts[1], s + 1)
+            yield ('end', None, len(program))
+            return
+
+    pos, l = 0, len(program)
+    while pos < l:
+        c = program[pos]
+        if c.isspace(): # skip inter-token whitespace
+            pass
+        elif c == ':' and program[pos:pos + 2] == '::': # look ahead carefully
+            yield ('::', None, pos)
+            pos += 1 # skip ahead
+        elif c == '.' and program[pos:pos + 2] == '..': # look ahead carefully
+            yield ('..', None, pos)
+            pos += 1 # skip ahead
+        elif c == '#' and program[pos:pos + 2] == '##': # look ahead carefully
+            yield ('##', None, pos)
+            pos += 1 # skip ahead
+        elif c in _simpleopletters: # handle simple operators
+            yield (c, None, pos)
+        elif (c in _quoteletters or c == 'r' and
+              program[pos:pos + 2] in ("r'", 'r"')): # handle quoted strings
+            if c == 'r':
+                pos += 1
+                c = program[pos]
+                decode = lambda x: x
+            else:
+                decode = parser.unescapestr
+            pos += 1
+            s = pos
+            while pos < l: # find closing quote
+                d = program[pos]
+                if d == '\\': # skip over escaped characters
+                    pos += 2
+                    continue
+                if d == c:
+                    yield ('string', decode(program[s:pos]), s)
+                    break
+                pos += 1
+            else:
+                raise error.ParseError(_("unterminated string"), s)
+        # gather up a symbol/keyword
+        elif c in syminitletters:
+            s = pos
+            pos += 1
+            while pos < l: # find end of symbol
+                d = program[pos]
+                if d not in symletters:
+                    break
+                if d == '.' and program[pos - 1] == '.': # special case for ..
+                    pos -= 1
+                    break
+                pos += 1
+            sym = program[s:pos]
+            if sym in keywords: # operator keywords
+                yield (sym, None, s)
+            elif '-' in sym:
+                # some jerk gave us foo-bar-baz, try to check if it's a symbol
+                if lookup and lookup(sym):
+                    # looks like a real symbol
+                    yield ('symbol', sym, s)
+                else:
+                    # looks like an expression
+                    parts = sym.split('-')
+                    for p in parts[:-1]:
+                        if p: # possible consecutive -
+                            yield ('symbol', p, s)
+                        s += len(p)
+                        yield ('-', None, pos)
+                        s += 1
+                    if parts[-1]: # possible trailing -
+                        yield ('symbol', parts[-1], s)
+            else:
+                yield ('symbol', sym, s)
+            pos -= 1
+        else:
+            raise error.ParseError(_("syntax error in revset '%s'") %
+                                   program, pos)
+        pos += 1
+    yield ('end', None, pos)
+
+# helpers
+
+_notset = object()
+
+def getsymbol(x):
+    if x and x[0] == 'symbol':
+        return x[1]
+    raise error.ParseError(_('not a symbol'))
+
+def getstring(x, err):
+    if x and (x[0] == 'string' or x[0] == 'symbol'):
+        return x[1]
+    raise error.ParseError(err)
+
+def getinteger(x, err, default=_notset):
+    if not x and default is not _notset:
+        return default
+    try:
+        return int(getstring(x, err))
+    except ValueError:
+        raise error.ParseError(err)
+
+def getlist(x):
+    if not x:
+        return []
+    if x[0] == 'list':
+        return list(x[1:])
+    return [x]
+
+def getrange(x, err):
+    if not x:
+        raise error.ParseError(err)
+    op = x[0]
+    if op == 'range':
+        return x[1], x[2]
+    elif op == 'rangepre':
+        return None, x[1]
+    elif op == 'rangepost':
+        return x[1], None
+    elif op == 'rangeall':
+        return None, None
+    raise error.ParseError(err)
+
+def getargs(x, min, max, err):
+    l = getlist(x)
+    if len(l) < min or (max >= 0 and len(l) > max):
+        raise error.ParseError(err)
+    return l
+
+def getargsdict(x, funcname, keys):
+    return parser.buildargsdict(getlist(x), funcname, parser.splitargspec(keys),
+                                keyvaluenode='keyvalue', keynode='symbol')
+
+# Constants for ordering requirement, used in _analyze():
+#
+# If 'define', any nested functions and operations can change the ordering of
+# the entries in the set. If 'follow', any nested functions and operations
+# should take the ordering specified by the first operand to the '&' operator.
+#
+# For instance,
+#
+#   X & (Y | Z)
+#   ^   ^^^^^^^
+#   |   follow
+#   define
+#
+# will be evaluated as 'or(y(x()), z(x()))', where 'x()' can change the order
+# of the entries in the set, but 'y()', 'z()' and 'or()' shouldn't.
+#
+# 'any' means the order doesn't matter. For instance,
+#
+#   X & !Y
+#        ^
+#        any
+#
+# 'y()' can either enforce its ordering requirement or take the ordering
+# specified by 'x()' because 'not()' doesn't care the order.
+#
+# Transition of ordering requirement:
+#
+# 1. starts with 'define'
+# 2. shifts to 'follow' by 'x & y'
+# 3. changes back to 'define' on function call 'f(x)' or function-like
+#    operation 'x (f) y' because 'f' may have its own ordering requirement
+#    for 'x' and 'y' (e.g. 'first(x)')
+#
+anyorder = 'any' # don't care the order
+defineorder = 'define' # should define the order
+followorder = 'follow' # must follow the current order
+
+# transition table for 'x & y', from the current expression 'x' to 'y'
+_tofolloworder = {
+    anyorder: anyorder,
+    defineorder: followorder,
+    followorder: followorder,
+}
+
+def _matchonly(revs, bases):
+    """
+    >>> f = lambda *args: _matchonly(*map(parse, args))
+    >>> f('ancestors(A)', 'not ancestors(B)')
+    ('list', ('symbol', 'A'), ('symbol', 'B'))
+    """
+    if (revs is not None
+        and revs[0] == 'func'
+        and getsymbol(revs[1]) == 'ancestors'
+        and bases is not None
+        and bases[0] == 'not'
+        and bases[1][0] == 'func'
+        and getsymbol(bases[1][1]) == 'ancestors'):
+        return ('list', revs[2], bases[1][2])
+
+def _fixops(x):
+    """Rewrite raw parsed tree to resolve ambiguous syntax which cannot be
+    handled well by our simple top-down parser"""
+    if not isinstance(x, tuple):
+        return x
+
+    op = x[0]
+    if op == 'parent':
+        # x^:y means (x^) : y, not x ^ (:y)
+        # x^: means (x^) :, not x ^ (:)
+        post = ('parentpost', x[1])
+        if x[2][0] == 'dagrangepre':
+            return _fixops(('dagrange', post, x[2][1]))
+        elif x[2][0] == 'rangepre':
+            return _fixops(('range', post, x[2][1]))
+        elif x[2][0] == 'rangeall':
+            return _fixops(('rangepost', post))
+    elif op == 'or':
+        # make number of arguments deterministic:
+        # x + y + z -> (or x y z) -> (or (list x y z))
+        return (op, _fixops(('list',) + x[1:]))
+
+    return (op,) + tuple(_fixops(y) for y in x[1:])
+
+def _analyze(x, order):
+    if x is None:
+        return x
+
+    op = x[0]
+    if op == 'minus':
+        return _analyze(('and', x[1], ('not', x[2])), order)
+    elif op == 'only':
+        t = ('func', ('symbol', 'only'), ('list', x[1], x[2]))
+        return _analyze(t, order)
+    elif op == 'onlypost':
+        return _analyze(('func', ('symbol', 'only'), x[1]), order)
+    elif op == 'dagrangepre':
+        return _analyze(('func', ('symbol', 'ancestors'), x[1]), order)
+    elif op == 'dagrangepost':
+        return _analyze(('func', ('symbol', 'descendants'), x[1]), order)
+    elif op == 'negate':
+        s = getstring(x[1], _("can't negate that"))
+        return _analyze(('string', '-' + s), order)
+    elif op in ('string', 'symbol'):
+        return x
+    elif op == 'and':
+        ta = _analyze(x[1], order)
+        tb = _analyze(x[2], _tofolloworder[order])
+        return (op, ta, tb, order)
+    elif op == 'or':
+        return (op, _analyze(x[1], order), order)
+    elif op == 'not':
+        return (op, _analyze(x[1], anyorder), order)
+    elif op == 'rangeall':
+        return (op, None, order)
+    elif op in ('rangepre', 'rangepost', 'parentpost'):
+        return (op, _analyze(x[1], defineorder), order)
+    elif op == 'group':
+        return _analyze(x[1], order)
+    elif op in ('dagrange', 'range', 'parent', 'ancestor'):
+        ta = _analyze(x[1], defineorder)
+        tb = _analyze(x[2], defineorder)
+        return (op, ta, tb, order)
+    elif op == 'list':
+        return (op,) + tuple(_analyze(y, order) for y in x[1:])
+    elif op == 'keyvalue':
+        return (op, x[1], _analyze(x[2], order))
+    elif op == 'func':
+        f = getsymbol(x[1])
+        d = defineorder
+        if f == 'present':
+            # 'present(set)' is known to return the argument set with no
+            # modification, so forward the current order to its argument
+            d = order
+        return (op, x[1], _analyze(x[2], d), order)
+    raise ValueError('invalid operator %r' % op)
+
+def analyze(x, order=defineorder):
+    """Transform raw parsed tree to evaluatable tree which can be fed to
+    optimize() or getset()
+
+    All pseudo operations should be mapped to real operations or functions
+    defined in methods or symbols table respectively.
+
+    'order' specifies how the current expression 'x' is ordered (see the
+    constants defined above.)
+    """
+    return _analyze(x, order)
+
+def _optimize(x, small):
+    if x is None:
+        return 0, x
+
+    smallbonus = 1
+    if small:
+        smallbonus = .5
+
+    op = x[0]
+    if op in ('string', 'symbol'):
+        return smallbonus, x # single revisions are small
+    elif op == 'and':
+        wa, ta = _optimize(x[1], True)
+        wb, tb = _optimize(x[2], True)
+        order = x[3]
+        w = min(wa, wb)
+
+        # (::x and not ::y)/(not ::y and ::x) have a fast path
+        tm = _matchonly(ta, tb) or _matchonly(tb, ta)
+        if tm:
+            return w, ('func', ('symbol', 'only'), tm, order)
+
+        if tb is not None and tb[0] == 'not':
+            return wa, ('difference', ta, tb[1], order)
+
+        if wa > wb:
+            return w, (op, tb, ta, order)
+        return w, (op, ta, tb, order)
+    elif op == 'or':
+        # fast path for machine-generated expression, that is likely to have
+        # lots of trivial revisions: 'a + b + c()' to '_list(a b) + c()'
+        order = x[2]
+        ws, ts, ss = [], [], []
+        def flushss():
+            if not ss:
+                return
+            if len(ss) == 1:
+                w, t = ss[0]
+            else:
+                s = '\0'.join(t[1] for w, t in ss)
+                y = ('func', ('symbol', '_list'), ('string', s), order)
+                w, t = _optimize(y, False)
+            ws.append(w)
+            ts.append(t)
+            del ss[:]
+        for y in getlist(x[1]):
+            w, t = _optimize(y, False)
+            if t is not None and (t[0] == 'string' or t[0] == 'symbol'):
+                ss.append((w, t))
+                continue
+            flushss()
+            ws.append(w)
+            ts.append(t)
+        flushss()
+        if len(ts) == 1:
+            return ws[0], ts[0] # 'or' operation is fully optimized out
+        # we can't reorder trees by weight
because it would change the order. + # ("sort(a + b)" == "sort(b + a)", but "a + b" != "b + a") + # ts = tuple(t for w, t in sorted(zip(ws, ts), key=lambda wt: wt[0])) + return max(ws), (op, ('list',) + tuple(ts), order) + elif op == 'not': + # Optimize not public() to _notpublic() because we have a fast version + if x[1][:3] == ('func', ('symbol', 'public'), None): + order = x[1][3] + newsym = ('func', ('symbol', '_notpublic'), None, order) + o = _optimize(newsym, not small) + return o[0], o[1] + else: + o = _optimize(x[1], not small) + order = x[2] + return o[0], (op, o[1], order) + elif op == 'rangeall': + return smallbonus, x + elif op in ('rangepre', 'rangepost', 'parentpost'): + o = _optimize(x[1], small) + order = x[2] + return o[0], (op, o[1], order) + elif op in ('dagrange', 'range', 'parent', 'ancestor'): + wa, ta = _optimize(x[1], small) + wb, tb = _optimize(x[2], small) + order = x[3] + return wa + wb, (op, ta, tb, order) + elif op == 'list': + ws, ts = zip(*(_optimize(y, small) for y in x[1:])) + return sum(ws), (op,) + ts + elif op == 'keyvalue': + w, t = _optimize(x[2], small) + return w, (op, x[1], t) + elif op == 'func': + f = getsymbol(x[1]) + wa, ta = _optimize(x[2], small) + if f in ('author', 'branch', 'closed', 'date', 'desc', 'file', 'grep', + 'keyword', 'outgoing', 'user', 'destination'): + w = 10 # slow + elif f in ('modifies', 'adds', 'removes'): + w = 30 # slower + elif f == "contains": + w = 100 # very slow + elif f == "ancestor": + w = 1 * smallbonus + elif f in ('reverse', 'limit', 'first', 'wdir', '_intlist'): + w = 0 + elif f == "sort": + w = 10 # assume most sorts look at changelog + else: + w = 1 + order = x[3] + return w + wa, (op, x[1], ta, order) + raise ValueError('invalid operator %r' % op) + +def optimize(tree): + """Optimize evaluatable tree + + All pseudo operations should be transformed beforehand. 
+ """ + _weight, newtree = _optimize(tree, small=True) + return newtree + +# the set of valid characters for the initial letter of symbols in +# alias declarations and definitions +_aliassyminitletters = _syminitletters | set(pycompat.sysstr('$')) + +def _parsewith(spec, lookup=None, syminitletters=None): + """Generate a parse tree of given spec with given tokenizing options + + >>> _parsewith('foo($1)', syminitletters=_aliassyminitletters) + ('func', ('symbol', 'foo'), ('symbol', '$1')) + >>> _parsewith('$1') + Traceback (most recent call last): + ... + ParseError: ("syntax error in revset '$1'", 0) + >>> _parsewith('foo bar') + Traceback (most recent call last): + ... + ParseError: ('invalid token', 4) + """ + p = parser.parser(elements) + tree, pos = p.parse(tokenize(spec, lookup=lookup, + syminitletters=syminitletters)) + if pos != len(spec): + raise error.ParseError(_('invalid token'), pos) + return _fixops(parser.simplifyinfixops(tree, ('list', 'or'))) + +class _aliasrules(parser.basealiasrules): + """Parsing and expansion rule set of revset aliases""" + _section = _('revset alias') + + @staticmethod + def _parse(spec): + """Parse alias declaration/definition ``spec`` + + This allows symbol names to use also ``$`` as an initial letter + (for backward compatibility), and callers of this function should + examine whether ``$`` is used also for unexpected symbols or not. 
+ """ + return _parsewith(spec, syminitletters=_aliassyminitletters) + + @staticmethod + def _trygetfunc(tree): + if tree[0] == 'func' and tree[1][0] == 'symbol': + return tree[1][1], getlist(tree[2]) + +def expandaliases(ui, tree): + aliases = _aliasrules.buildmap(ui.configitems('revsetalias')) + tree = _aliasrules.expand(aliases, tree) + # warn about problematic (but not referred) aliases + for name, alias in sorted(aliases.iteritems()): + if alias.error and not alias.warned: + ui.warn(_('warning: %s\n') % (alias.error)) + alias.warned = True + return tree + +def foldconcat(tree): + """Fold elements to be concatenated by `##` + """ + if not isinstance(tree, tuple) or tree[0] in ('string', 'symbol'): + return tree + if tree[0] == '_concat': + pending = [tree] + l = [] + while pending: + e = pending.pop() + if e[0] == '_concat': + pending.extend(reversed(e[1:])) + elif e[0] in ('string', 'symbol'): + l.append(e[1]) + else: + msg = _("\"##\" can't concatenate \"%s\" element") % (e[0]) + raise error.ParseError(msg) + return ('string', ''.join(l)) + else: + return tuple(foldconcat(t) for t in tree) + +def parse(spec, lookup=None): + return _parsewith(spec, lookup=lookup) + +def _quote(s): + r"""Quote a value in order to make it safe for the revset engine. + + >>> _quote('asdf') + "'asdf'" + >>> _quote("asdf'\"") + '\'asdf\\\'"\'' + >>> _quote('asdf\'') + "'asdf\\''" + >>> _quote(1) + "'1'" + """ + return "'%s'" % util.escapestr('%s' % s) + +def formatspec(expr, *args): + ''' + This is a convenience function for using revsets internally, and + escapes arguments appropriately. Aliases are intentionally ignored + so that intended expression behavior isn't accidentally subverted. 
+ + Supported arguments: + + %r = revset expression, parenthesized + %d = int(arg), no quoting + %s = string(arg), escaped and single-quoted + %b = arg.branch(), escaped and single-quoted + %n = hex(arg), single-quoted + %% = a literal '%' + + Prefixing the type with 'l' specifies a parenthesized list of that type. + + >>> formatspec('%r:: and %lr', '10 or 11', ("this()", "that()")) + '(10 or 11):: and ((this()) or (that()))' + >>> formatspec('%d:: and not %d::', 10, 20) + '10:: and not 20::' + >>> formatspec('%ld or %ld', [], [1]) + "_list('') or 1" + >>> formatspec('keyword(%s)', 'foo\\xe9') + "keyword('foo\\\\xe9')" + >>> b = lambda: 'default' + >>> b.branch = b + >>> formatspec('branch(%b)', b) + "branch('default')" + >>> formatspec('root(%ls)', ['a', 'b', 'c', 'd']) + "root(_list('a\\x00b\\x00c\\x00d'))" + ''' + + def argtype(c, arg): + if c == 'd': + return '%d' % int(arg) + elif c == 's': + return _quote(arg) + elif c == 'r': + parse(arg) # make sure syntax errors are confined + return '(%s)' % arg + elif c == 'n': + return _quote(node.hex(arg)) + elif c == 'b': + return _quote(arg.branch()) + + def listexp(s, t): + l = len(s) + if l == 0: + return "_list('')" + elif l == 1: + return argtype(t, s[0]) + elif t == 'd': + return "_intlist('%s')" % "\0".join('%d' % int(a) for a in s) + elif t == 's': + return "_list('%s')" % "\0".join(s) + elif t == 'n': + return "_hexlist('%s')" % "\0".join(node.hex(a) for a in s) + elif t == 'b': + return "_list('%s')" % "\0".join(a.branch() for a in s) + + m = l // 2 + return '(%s or %s)' % (listexp(s[:m], t), listexp(s[m:], t)) + + expr = pycompat.bytestr(expr) + ret = '' + pos = 0 + arg = 0 + while pos < len(expr): + c = expr[pos] + if c == '%': + pos += 1 + d = expr[pos] + if d == '%': + ret += d + elif d in 'dsnbr': + ret += argtype(d, args[arg]) + arg += 1 + elif d == 'l': + # a list of some type + pos += 1 + d = expr[pos] + ret += listexp(list(args[arg]), d) + arg += 1 + else: + raise error.Abort(_('unexpected revspec 
format character %s') + % d) + else: + ret += c + pos += 1 + + return ret + +def prettyformat(tree): + return parser.prettyformat(tree, ('string', 'symbol')) + +def depth(tree): + if isinstance(tree, tuple): + return max(map(depth, tree)) + 1 + else: + return 0 + +def funcsused(tree): + if not isinstance(tree, tuple) or tree[0] in ('string', 'symbol'): + return set() + else: + funcs = set() + for s in tree[1:]: + funcs |= funcsused(s) + if tree[0] == 'func': + funcs.add(tree[1][1]) + return funcs diff -r ed5b25874d99 -r 4baf79a77afa mercurial/scmposix.py --- a/mercurial/scmposix.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/scmposix.py Fri Mar 24 08:37:26 2017 -0700 @@ -40,8 +40,15 @@ def userrcpath(): if pycompat.sysplatform == 'plan9': return [encoding.environ['home'] + '/lib/hgrc'] + elif pycompat.sysplatform == 'darwin': + return [os.path.expanduser('~/.hgrc')] else: - return [os.path.expanduser('~/.hgrc')] + confighome = encoding.environ.get('XDG_CONFIG_HOME') + if confighome is None or not os.path.isabs(confighome): + confighome = os.path.expanduser('~/.config') + + return [os.path.expanduser('~/.hgrc'), + os.path.join(confighome, 'hg', 'hgrc')] def termsize(ui): try: @@ -59,7 +66,7 @@ if not os.isatty(fd): continue arri = fcntl.ioctl(fd, TIOCGWINSZ, '\0' * 8) - height, width = array.array('h', arri)[:2] + height, width = array.array(r'h', arri)[:2] if width > 0 and height > 0: return width, height except ValueError: diff -r ed5b25874d99 -r 4baf79a77afa mercurial/scmutil.py --- a/mercurial/scmutil.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/scmutil.py Fri Mar 24 08:37:26 2017 -0700 @@ -7,17 +7,12 @@ from __future__ import absolute_import -import contextlib import errno import glob import hashlib import os import re -import shutil import socket -import stat -import tempfile -import threading from .i18n import _ from .node import wdirrev @@ -29,9 +24,10 @@ pathutil, phases, pycompat, - revset, + revsetlang, similar, util, + vfs as vfsmod, ) if 
pycompat.osname == 'nt': @@ -332,459 +328,20 @@ if revs: s = hashlib.sha1() for rev in revs: - s.update('%s;' % rev) + s.update('%d;' % rev) key = s.digest() return key -class abstractvfs(object): - """Abstract base class; cannot be instantiated""" - - def __init__(self, *args, **kwargs): - '''Prevent instantiation; don't call this from subclasses.''' - raise NotImplementedError('attempted instantiating ' + str(type(self))) - - def tryread(self, path): - '''gracefully return an empty string for missing files''' - try: - return self.read(path) - except IOError as inst: - if inst.errno != errno.ENOENT: - raise - return "" - - def tryreadlines(self, path, mode='rb'): - '''gracefully return an empty array for missing files''' - try: - return self.readlines(path, mode=mode) - except IOError as inst: - if inst.errno != errno.ENOENT: - raise - return [] - - @util.propertycache - def open(self): - '''Open ``path`` file, which is relative to vfs root. - - Newly created directories are marked as "not to be indexed by - the content indexing service", if ``notindexed`` is specified - for "write" mode access. 
- ''' - return self.__call__ - - def read(self, path): - with self(path, 'rb') as fp: - return fp.read() - - def readlines(self, path, mode='rb'): - with self(path, mode=mode) as fp: - return fp.readlines() - - def write(self, path, data, backgroundclose=False): - with self(path, 'wb', backgroundclose=backgroundclose) as fp: - return fp.write(data) - - def writelines(self, path, data, mode='wb', notindexed=False): - with self(path, mode=mode, notindexed=notindexed) as fp: - return fp.writelines(data) - - def append(self, path, data): - with self(path, 'ab') as fp: - return fp.write(data) - - def basename(self, path): - """return base element of a path (as os.path.basename would do) - - This exists to allow handling of strange encoding if needed.""" - return os.path.basename(path) - - def chmod(self, path, mode): - return os.chmod(self.join(path), mode) - - def dirname(self, path): - """return dirname element of a path (as os.path.dirname would do) - - This exists to allow handling of strange encoding if needed.""" - return os.path.dirname(path) - - def exists(self, path=None): - return os.path.exists(self.join(path)) - - def fstat(self, fp): - return util.fstat(fp) - - def isdir(self, path=None): - return os.path.isdir(self.join(path)) - - def isfile(self, path=None): - return os.path.isfile(self.join(path)) - - def islink(self, path=None): - return os.path.islink(self.join(path)) - - def isfileorlink(self, path=None): - '''return whether path is a regular file or a symlink - - Unlike isfile, this doesn't follow symlinks.''' - try: - st = self.lstat(path) - except OSError: - return False - mode = st.st_mode - return stat.S_ISREG(mode) or stat.S_ISLNK(mode) - - def reljoin(self, *paths): - """join various elements of a path together (as os.path.join would do) - - The vfs base is not injected so that path stay relative. 
This exists - to allow handling of strange encoding if needed.""" - return os.path.join(*paths) - - def split(self, path): - """split top-most element of a path (as os.path.split would do) - - This exists to allow handling of strange encoding if needed.""" - return os.path.split(path) - - def lexists(self, path=None): - return os.path.lexists(self.join(path)) - - def lstat(self, path=None): - return os.lstat(self.join(path)) - - def listdir(self, path=None): - return os.listdir(self.join(path)) - - def makedir(self, path=None, notindexed=True): - return util.makedir(self.join(path), notindexed) - - def makedirs(self, path=None, mode=None): - return util.makedirs(self.join(path), mode) - - def makelock(self, info, path): - return util.makelock(info, self.join(path)) - - def mkdir(self, path=None): - return os.mkdir(self.join(path)) - - def mkstemp(self, suffix='', prefix='tmp', dir=None, text=False): - fd, name = tempfile.mkstemp(suffix=suffix, prefix=prefix, - dir=self.join(dir), text=text) - dname, fname = util.split(name) - if dir: - return fd, os.path.join(dir, fname) - else: - return fd, fname - - def readdir(self, path=None, stat=None, skip=None): - return osutil.listdir(self.join(path), stat, skip) - - def readlock(self, path): - return util.readlock(self.join(path)) - - def rename(self, src, dst, checkambig=False): - """Rename from src to dst - - checkambig argument is used with util.filestat, and is useful - only if destination file is guarded by any lock - (e.g. repo.lock or repo.wlock). 
- """ - dstpath = self.join(dst) - oldstat = checkambig and util.filestat(dstpath) - if oldstat and oldstat.stat: - ret = util.rename(self.join(src), dstpath) - newstat = util.filestat(dstpath) - if newstat.isambig(oldstat): - # stat of renamed file is ambiguous to original one - newstat.avoidambig(dstpath, oldstat) - return ret - return util.rename(self.join(src), dstpath) - - def readlink(self, path): - return os.readlink(self.join(path)) - - def removedirs(self, path=None): - """Remove a leaf directory and all empty intermediate ones - """ - return util.removedirs(self.join(path)) - - def rmtree(self, path=None, ignore_errors=False, forcibly=False): - """Remove a directory tree recursively - - If ``forcibly``, this tries to remove READ-ONLY files, too. - """ - if forcibly: - def onerror(function, path, excinfo): - if function is not os.remove: - raise - # read-only files cannot be unlinked under Windows - s = os.stat(path) - if (s.st_mode & stat.S_IWRITE) != 0: - raise - os.chmod(path, stat.S_IMODE(s.st_mode) | stat.S_IWRITE) - os.remove(path) - else: - onerror = None - return shutil.rmtree(self.join(path), - ignore_errors=ignore_errors, onerror=onerror) - - def setflags(self, path, l, x): - return util.setflags(self.join(path), l, x) - - def stat(self, path=None): - return os.stat(self.join(path)) - - def unlink(self, path=None): - return util.unlink(self.join(path)) - - def unlinkpath(self, path=None, ignoremissing=False): - return util.unlinkpath(self.join(path), ignoremissing) - - def utime(self, path=None, t=None): - return os.utime(self.join(path), t) - - def walk(self, path=None, onerror=None): - """Yield (dirpath, dirs, files) tuple for each directories under path - - ``dirpath`` is relative one from the root of this vfs. This - uses ``os.sep`` as path separator, even you specify POSIX - style ``path``. - - "The root of this vfs" is represented as empty ``dirpath``. 
- """ - root = os.path.normpath(self.join(None)) - # when dirpath == root, dirpath[prefixlen:] becomes empty - # because len(dirpath) < prefixlen. - prefixlen = len(pathutil.normasprefix(root)) - for dirpath, dirs, files in os.walk(self.join(path), onerror=onerror): - yield (dirpath[prefixlen:], dirs, files) - - @contextlib.contextmanager - def backgroundclosing(self, ui, expectedcount=-1): - """Allow files to be closed asynchronously. - - When this context manager is active, ``backgroundclose`` can be passed - to ``__call__``/``open`` to result in the file possibly being closed - asynchronously, on a background thread. - """ - # This is an arbitrary restriction and could be changed if we ever - # have a use case. - vfs = getattr(self, 'vfs', self) - if getattr(vfs, '_backgroundfilecloser', None): - raise error.Abort( - _('can only have 1 active background file closer')) - - with backgroundfilecloser(ui, expectedcount=expectedcount) as bfc: - try: - vfs._backgroundfilecloser = bfc - yield bfc - finally: - vfs._backgroundfilecloser = None - -class vfs(abstractvfs): - '''Operate files relative to a base directory - - This class is used to hide the details of COW semantics and - remote file access from higher level code. 
- ''' - def __init__(self, base, audit=True, expandpath=False, realpath=False): - if expandpath: - base = util.expandpath(base) - if realpath: - base = os.path.realpath(base) - self.base = base - self.mustaudit = audit - self.createmode = None - self._trustnlink = None - - @property - def mustaudit(self): - return self._audit - - @mustaudit.setter - def mustaudit(self, onoff): - self._audit = onoff - if onoff: - self.audit = pathutil.pathauditor(self.base) - else: - self.audit = util.always - - @util.propertycache - def _cansymlink(self): - return util.checklink(self.base) - - @util.propertycache - def _chmod(self): - return util.checkexec(self.base) - - def _fixfilemode(self, name): - if self.createmode is None or not self._chmod: - return - os.chmod(name, self.createmode & 0o666) - - def __call__(self, path, mode="r", text=False, atomictemp=False, - notindexed=False, backgroundclose=False, checkambig=False): - '''Open ``path`` file, which is relative to vfs root. - - Newly created directories are marked as "not to be indexed by - the content indexing service", if ``notindexed`` is specified - for "write" mode access. - - If ``backgroundclose`` is passed, the file may be closed asynchronously. - It can only be used if the ``self.backgroundclosing()`` context manager - is active. This should only be specified if the following criteria hold: - - 1. There is a potential for writing thousands of files. Unless you - are writing thousands of files, the performance benefits of - asynchronously closing files is not realized. - 2. Files are opened exactly once for the ``backgroundclosing`` - active duration and are therefore free of race conditions between - closing a file on a background thread and reopening it. (If the - file were opened multiple times, there could be unflushed data - because the original file handle hasn't been flushed/closed yet.) 
- - ``checkambig`` argument is passed to atomictemplfile (valid - only for writing), and is useful only if target file is - guarded by any lock (e.g. repo.lock or repo.wlock). - ''' - if self._audit: - r = util.checkosfilename(path) - if r: - raise error.Abort("%s: %r" % (r, path)) - self.audit(path) - f = self.join(path) - - if not text and "b" not in mode: - mode += "b" # for that other OS - - nlink = -1 - if mode not in ('r', 'rb'): - dirname, basename = util.split(f) - # If basename is empty, then the path is malformed because it points - # to a directory. Let the posixfile() call below raise IOError. - if basename: - if atomictemp: - util.makedirs(dirname, self.createmode, notindexed) - return util.atomictempfile(f, mode, self.createmode, - checkambig=checkambig) - try: - if 'w' in mode: - util.unlink(f) - nlink = 0 - else: - # nlinks() may behave differently for files on Windows - # shares if the file is open. - with util.posixfile(f): - nlink = util.nlinks(f) - if nlink < 1: - nlink = 2 # force mktempcopy (issue1922) - except (OSError, IOError) as e: - if e.errno != errno.ENOENT: - raise - nlink = 0 - util.makedirs(dirname, self.createmode, notindexed) - if nlink > 0: - if self._trustnlink is None: - self._trustnlink = nlink > 1 or util.checknlink(f) - if nlink > 1 or not self._trustnlink: - util.rename(util.mktempcopy(f), f) - fp = util.posixfile(f, mode) - if nlink == 0: - self._fixfilemode(f) - - if checkambig: - if mode in ('r', 'rb'): - raise error.Abort(_('implementation error: mode %s is not' - ' valid for checkambig=True') % mode) - fp = checkambigatclosing(fp) - - if backgroundclose: - if not self._backgroundfilecloser: - raise error.Abort(_('backgroundclose can only be used when a ' - 'backgroundclosing context manager is active') - ) - - fp = delayclosedfile(fp, self._backgroundfilecloser) - - return fp - - def symlink(self, src, dst): - self.audit(dst) - linkname = self.join(dst) - try: - os.unlink(linkname) - except OSError: - pass - - 
util.makedirs(os.path.dirname(linkname), self.createmode) - - if self._cansymlink: - try: - os.symlink(src, linkname) - except OSError as err: - raise OSError(err.errno, _('could not symlink to %r: %s') % - (src, err.strerror), linkname) - else: - self.write(dst, src) - - def join(self, path, *insidef): - if path: - return os.path.join(self.base, path, *insidef) - else: - return self.base - -opener = vfs - -class auditvfs(object): - def __init__(self, vfs): - self.vfs = vfs - - @property - def mustaudit(self): - return self.vfs.mustaudit - - @mustaudit.setter - def mustaudit(self, onoff): - self.vfs.mustaudit = onoff - - @property - def options(self): - return self.vfs.options - - @options.setter - def options(self, value): - self.vfs.options = value - -class filtervfs(abstractvfs, auditvfs): - '''Wrapper vfs for filtering filenames with a function.''' - - def __init__(self, vfs, filter): - auditvfs.__init__(self, vfs) - self._filter = filter - - def __call__(self, path, *args, **kwargs): - return self.vfs(self._filter(path), *args, **kwargs) - - def join(self, path, *insidef): - if path: - return self.vfs.join(self._filter(self.vfs.reljoin(path, *insidef))) - else: - return self.vfs.join(path) - -filteropener = filtervfs - -class readonlyvfs(abstractvfs, auditvfs): - '''Wrapper vfs preventing any writing.''' - - def __init__(self, vfs): - auditvfs.__init__(self, vfs) - - def __call__(self, path, mode='r', *args, **kw): - if mode not in ('r', 'rb'): - raise error.Abort(_('this vfs is read only')) - return self.vfs(path, mode, *args, **kw) - - def join(self, path, *insidef): - return self.vfs.join(path, *insidef) +# compatibility layer since all 'vfs' code moved to 'mercurial.vfs' +# +# It is hard to install a deprecation warning here since we do not have +# access to a 'ui' object.
+opener = vfs = vfsmod.vfs +filteropener = filtervfs = vfsmod.filtervfs +abstractvfs = vfsmod.abstractvfs +readonlyvfs = vfsmod.readonlyvfs +auditvfs = vfsmod.auditvfs +checkambigatclosing = vfsmod.checkambigatclosing def walkrepos(path, followsym=False, seen_dirs=None, recurse=False): '''yield every hg repository under path, always recursively. @@ -890,7 +447,7 @@ return repo[l.last()] def _pairspec(revspec): - tree = revset.parse(revspec) + tree = revsetlang.parse(revspec) return tree and tree[0] in ('range', 'rangepre', 'rangepost', 'rangeall') def revpair(repo, revs): @@ -936,7 +493,7 @@ revision numbers. It is assumed the revsets are already formatted. If you have arguments - that need to be expanded in the revset, call ``revset.formatspec()`` + that need to be expanded in the revset, call ``revsetlang.formatspec()`` and pass the result as an element of ``specs``. Specifying a single revset is allowed. @@ -947,10 +504,9 @@ allspecs = [] for spec in specs: if isinstance(spec, int): - spec = revset.formatspec('rev(%d)', spec) + spec = revsetlang.formatspec('rev(%d)', spec) allspecs.append(spec) - m = revset.matchany(repo.ui, allspecs, repo) - return m(repo) + return repo.anyrevs(allspecs, user=True) def meaningfulparents(repo, ctx): """Return list of meaningful (or all if debug) parentrevs for rev. @@ -1325,11 +881,11 @@ function to call the appropriate join function on 'obj' (an instance of the class that its member function was decorated). """ - return obj.join(fname) + raise NotImplementedError def __call__(self, func): self.func = func - self.name = func.__name__ + self.name = func.__name__.encode('ascii') return self def __get__(self, obj, type=None): @@ -1410,164 +966,40 @@ # experimental config: format.generaldelta return ui.configbool('format', 'generaldelta', False) -class closewrapbase(object): - """Base class of wrapper, which hooks closing - - Do not instantiate outside of the vfs layer. 
- """ - def __init__(self, fh): - object.__setattr__(self, '_origfh', fh) - - def __getattr__(self, attr): - return getattr(self._origfh, attr) - - def __setattr__(self, attr, value): - return setattr(self._origfh, attr, value) - - def __delattr__(self, attr): - return delattr(self._origfh, attr) +class simplekeyvaluefile(object): + """A simple file with key=value lines - def __enter__(self): - return self._origfh.__enter__() - - def __exit__(self, exc_type, exc_value, exc_tb): - raise NotImplementedError('attempted instantiating ' + str(type(self))) - - def close(self): - raise NotImplementedError('attempted instantiating ' + str(type(self))) - -class delayclosedfile(closewrapbase): - """Proxy for a file object whose close is delayed. - - Do not instantiate outside of the vfs layer. - """ - def __init__(self, fh, closer): - super(delayclosedfile, self).__init__(fh) - object.__setattr__(self, '_closer', closer) - - def __exit__(self, exc_type, exc_value, exc_tb): - self._closer.close(self._origfh) + Keys must be alphanumerics and start with a letter, values must not + contain '\n' characters""" - def close(self): - self._closer.close(self._origfh) - -class backgroundfilecloser(object): - """Coordinates background closing of file handles on multiple threads.""" - def __init__(self, ui, expectedcount=-1): - self._running = False - self._entered = False - self._threads = [] - self._threadexception = None - - # Only Windows/NTFS has slow file closing. So only enable by default - # on that platform. But allow to be enabled elsewhere for testing. - defaultenabled = pycompat.osname == 'nt' - enabled = ui.configbool('worker', 'backgroundclose', defaultenabled) - - if not enabled: - return + def __init__(self, vfs, path, keys=None): + self.vfs = vfs + self.path = path - # There is overhead to starting and stopping the background threads. - # Don't do background processing unless the file count is large enough - # to justify it. 
- minfilecount = ui.configint('worker', 'backgroundcloseminfilecount', - 2048) - # FUTURE dynamically start background threads after minfilecount closes. - # (We don't currently have any callers that don't know their file count) - if expectedcount > 0 and expectedcount < minfilecount: - return - - # Windows defaults to a limit of 512 open files. A buffer of 128 - # should give us enough headway. - maxqueue = ui.configint('worker', 'backgroundclosemaxqueue', 384) - threadcount = ui.configint('worker', 'backgroundclosethreadcount', 4) - - ui.debug('starting %d threads for background file closing\n' % - threadcount) - - self._queue = util.queue(maxsize=maxqueue) - self._running = True + def read(self): + lines = self.vfs.readlines(self.path) + try: + d = dict(line[:-1].split('=', 1) for line in lines if line) + except ValueError as e: + raise error.CorruptedState(str(e)) + return d - for i in range(threadcount): - t = threading.Thread(target=self._worker, name='backgroundcloser') - self._threads.append(t) - t.start() - - def __enter__(self): - self._entered = True - return self - - def __exit__(self, exc_type, exc_value, exc_tb): - self._running = False - - # Wait for threads to finish closing so open files don't linger for - # longer than lifetime of context manager. - for t in self._threads: - t.join() - - def _worker(self): - """Main routine for worker thread.""" - while True: - try: - fh = self._queue.get(block=True, timeout=0.100) - # Need to catch or the thread will terminate and - # we could orphan file descriptors. - try: - fh.close() - except Exception as e: - # Stash so can re-raise from main thread later. - self._threadexception = e - except util.empty: - if not self._running: - break - - def close(self, fh): - """Schedule a file for closing.""" - if not self._entered: - raise error.Abort(_('can only call close() when context manager ' - 'active')) + def write(self, data): + """Write key=>value mapping to a file + data is a dict. 
Keys must be alphanumerical and start with a letter. + Values must not contain newline characters.""" + lines = [] + for k, v in data.items(): + if not k[0].isalpha(): + e = "keys must start with a letter in a key-value file" + raise error.ProgrammingError(e) + if not k.isalnum(): + e = "invalid key name in a simple key-value file" + raise error.ProgrammingError(e) + if '\n' in v: + e = "invalid value in a simple key-value file" + raise error.ProgrammingError(e) + lines.append("%s=%s\n" % (k, v)) + with self.vfs(self.path, mode='wb', atomictemp=True) as fp: + fp.write(''.join(lines)) - # If a background thread encountered an exception, raise now so we fail - # fast. Otherwise we may potentially go on for minutes until the error - # is acted on. - if self._threadexception: - e = self._threadexception - self._threadexception = None - raise e - - # If we're not actively running, close synchronously. - if not self._running: - fh.close() - return - - self._queue.put(fh, block=True, timeout=None) - -class checkambigatclosing(closewrapbase): - """Proxy for a file object, to avoid ambiguity of file stat - - See also util.filestat for detail about "ambiguity of file stat". - - This proxy is useful only if the target file is guarded by any - lock (e.g. repo.lock or repo.wlock) - - Do not instantiate outside of the vfs layer. 
- """ - def __init__(self, fh): - super(checkambigatclosing, self).__init__(fh) - object.__setattr__(self, '_oldstat', util.filestat(fh.name)) - - def _checkambig(self): - oldstat = self._oldstat - if oldstat.stat: - newstat = util.filestat(self._origfh.name) - if newstat.isambig(oldstat): - # stat of changed file is ambiguous to original one - newstat.avoidambig(self._origfh.name, oldstat) - - def __exit__(self, exc_type, exc_value, exc_tb): - self._origfh.__exit__(exc_type, exc_value, exc_tb) - self._checkambig() - - def close(self): - self._origfh.close() - self._checkambig() diff -r ed5b25874d99 -r 4baf79a77afa mercurial/server.py --- a/mercurial/server.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/server.py Fri Mar 24 08:37:26 2017 -0700 @@ -7,7 +7,6 @@ from __future__ import absolute_import -import errno import os import sys import tempfile @@ -60,11 +59,7 @@ raise error.Abort(_('child process failed to start')) writepid(pid) finally: - try: - os.unlink(lockpath) - except OSError as e: - if e.errno != errno.ENOENT: - raise + util.tryunlink(lockpath) if parentfn: return parentfn(pid) else: diff -r ed5b25874d99 -r 4baf79a77afa mercurial/similar.py --- a/mercurial/similar.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/similar.py Fri Mar 24 08:37:26 2017 -0700 @@ -7,8 +7,6 @@ from __future__ import absolute_import -import hashlib - from .i18n import _ from . import ( bdiff, @@ -23,21 +21,29 @@ ''' numfiles = len(added) + len(removed) - # Get hashes of removed files. + # Build table of removed files: {hash(fctx.data()): [fctx, ...]}. + # We use hash() to discard fctx.data() from memory. hashes = {} for i, fctx in enumerate(removed): repo.ui.progress(_('searching for exact renames'), i, total=numfiles, unit=_('files')) - h = hashlib.sha1(fctx.data()).digest() - hashes[h] = fctx + h = hash(fctx.data()) + if h not in hashes: + hashes[h] = [fctx] + else: + hashes[h].append(fctx) # For each added file, see if it corresponds to a removed file. 
for i, fctx in enumerate(added): repo.ui.progress(_('searching for exact renames'), i + len(removed), total=numfiles, unit=_('files')) - h = hashlib.sha1(fctx.data()).digest() - if h in hashes: - yield (hashes[h], fctx) + adata = fctx.data() + h = hash(adata) + for rfctx in hashes.get(h, []): + # compare between actual file contents for exact identity + if adata == rfctx.data(): + yield (rfctx, fctx) + break # Done repo.ui.progress(_('searching for exact renames'), None) @@ -81,7 +87,7 @@ if data is None: data = _ctxdata(r) myscore = _score(a, data) - if myscore >= bestscore: + if myscore > bestscore: copies[a] = (r, myscore) repo.ui.progress(_('searching'), None) @@ -89,27 +95,29 @@ source, bscore = v yield source, dest, bscore +def _dropempty(fctxs): + return [x for x in fctxs if x.size() > 0] + def findrenames(repo, added, removed, threshold): '''find renamed files -- yields (before, after, score) tuples''' - parentctx = repo['.'] - workingctx = repo[None] + wctx = repo[None] + pctx = wctx.p1() # Zero length files will be frequently unrelated to each other, and # tracking the deletion/addition of such a file will probably cause more # harm than good. We strip them out here to avoid matching them later on. - addedfiles = set([workingctx[fp] for fp in added - if workingctx[fp].size() > 0]) - removedfiles = set([parentctx[fp] for fp in removed - if fp in parentctx and parentctx[fp].size() > 0]) + addedfiles = _dropempty(wctx[fp] for fp in sorted(added)) + removedfiles = _dropempty(pctx[fp] for fp in sorted(removed) if fp in pctx) # Find exact matches. - for (a, b) in _findexactmatches(repo, - sorted(addedfiles), sorted(removedfiles)): - addedfiles.remove(b) + matchedfiles = set() + for (a, b) in _findexactmatches(repo, addedfiles, removedfiles): + matchedfiles.add(b) yield (a.path(), b.path(), 1.0) # If the user requested similar files to be matched, search for them also. 
if threshold < 1.0: - for (a, b, score) in _findsimilarmatches(repo, - sorted(addedfiles), sorted(removedfiles), threshold): + addedfiles = [x for x in addedfiles if x not in matchedfiles] + for (a, b, score) in _findsimilarmatches(repo, addedfiles, + removedfiles, threshold): yield (a.path(), b.path(), score) diff -r ed5b25874d99 -r 4baf79a77afa mercurial/simplemerge.py --- a/mercurial/simplemerge.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/simplemerge.py Fri Mar 24 08:37:26 2017 -0700 @@ -24,8 +24,8 @@ from . import ( error, mdiff, - scmutil, util, + vfs as vfsmod, ) class CantReprocessAndShowBase(Exception): @@ -437,7 +437,7 @@ local = os.path.realpath(local) if not opts.get('print'): - opener = scmutil.opener(os.path.dirname(local)) + opener = vfsmod.vfs(os.path.dirname(local)) out = opener(os.path.basename(local), "w", atomictemp=True) else: out = ui.fout diff -r ed5b25874d99 -r 4baf79a77afa mercurial/smartset.py --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/mercurial/smartset.py Fri Mar 24 08:37:26 2017 -0700 @@ -0,0 +1,1066 @@ +# smartset.py - data structure for revision set +# +# Copyright 2010 Matt Mackall +# +# This software may be used and distributed according to the terms of the +# GNU General Public License version 2 or any later version. + +from __future__ import absolute_import + +from . 
import ( + util, +) + +def _formatsetrepr(r): + """Format an optional printable representation of a set + + ======== ================================= + type(r) example + ======== ================================= + tuple ('<not %r>', other) + str '<branch closed>' + callable lambda: '<branch %r>' % sorted(b) + object other + ======== ================================= + """ + if r is None: + return '' + elif isinstance(r, tuple): + return r[0] % r[1:] + elif isinstance(r, str): + return r + elif callable(r): + return r() + else: + return repr(r) + +class abstractsmartset(object): + + def __nonzero__(self): + """True if the smartset is not empty""" + raise NotImplementedError() + + __bool__ = __nonzero__ + + def __contains__(self, rev): + """provide fast membership testing""" + raise NotImplementedError() + + def __iter__(self): + """iterate the set in the order it is supposed to be iterated""" + raise NotImplementedError() + + # Attributes containing a function to perform a fast iteration in a given + # direction. A smartset can have none, one, or both defined. + # + # Default value is None instead of a function returning None to avoid + # initializing an iterator just for testing if a fast method exists. 
+ fastasc = None + fastdesc = None + + def isascending(self): + """True if the set will iterate in ascending order""" + raise NotImplementedError() + + def isdescending(self): + """True if the set will iterate in descending order""" + raise NotImplementedError() + + def istopo(self): + """True if the set will iterate in topographical order""" + raise NotImplementedError() + + def min(self): + """return the minimum element in the set""" + if self.fastasc is None: + v = min(self) + else: + for v in self.fastasc(): + break + else: + raise ValueError('arg is an empty sequence') + self.min = lambda: v + return v + + def max(self): + """return the maximum element in the set""" + if self.fastdesc is None: + return max(self) + else: + for v in self.fastdesc(): + break + else: + raise ValueError('arg is an empty sequence') + self.max = lambda: v + return v + + def first(self): + """return the first element in the set (user iteration perspective) + + Return None if the set is empty""" + raise NotImplementedError() + + def last(self): + """return the last element in the set (user iteration perspective) + + Return None if the set is empty""" + raise NotImplementedError() + + def __len__(self): + """return the length of the smartsets + + This can be expensive on smartset that could be lazy otherwise.""" + raise NotImplementedError() + + def reverse(self): + """reverse the expected iteration order""" + raise NotImplementedError() + + def sort(self, reverse=True): + """get the set to iterate in an ascending or descending order""" + raise NotImplementedError() + + def __and__(self, other): + """Returns a new object with the intersection of the two collections. + + This is part of the mandatory API for smartset.""" + if isinstance(other, fullreposet): + return self + return self.filter(other.__contains__, condrepr=other, cache=False) + + def __add__(self, other): + """Returns a new object with the union of the two collections. 
+ + This is part of the mandatory API for smartset.""" + return addset(self, other) + + def __sub__(self, other): + """Returns a new object with the subtraction of the two collections. + + This is part of the mandatory API for smartset.""" + c = other.__contains__ + return self.filter(lambda r: not c(r), condrepr=('<not %r>', other), + cache=False) + + def filter(self, condition, condrepr=None, cache=True): + """Returns this smartset filtered by condition as a new smartset. + + `condition` is a callable which takes a revision number and returns a + boolean. Optional `condrepr` provides a printable representation of + the given `condition`. + + This is part of the mandatory API for smartset.""" + # builtins cannot be cached, but do not need to be + if cache and util.safehasattr(condition, 'func_code'): + condition = util.cachefunc(condition) + return filteredset(self, condition, condrepr) + +class baseset(abstractsmartset): + """Basic data structure that represents a revset and contains the basic + operations that it should be able to perform. + + Every method in this class should be implemented by any smartset class. + + This class can be constructed from an (unordered) set, or an (ordered) + list-like object. If a set is provided, it'll be sorted lazily. 
+ + >>> x = [4, 0, 7, 6] + >>> y = [5, 6, 7, 3] + + Construct by a set: + >>> xs = baseset(set(x)) + >>> ys = baseset(set(y)) + >>> [list(i) for i in [xs + ys, xs & ys, xs - ys]] + [[0, 4, 6, 7, 3, 5], [6, 7], [0, 4]] + >>> [type(i).__name__ for i in [xs + ys, xs & ys, xs - ys]] + ['addset', 'baseset', 'baseset'] + + Construct by a list-like: + >>> xs = baseset(x) + >>> ys = baseset(i for i in y) + >>> [list(i) for i in [xs + ys, xs & ys, xs - ys]] + [[4, 0, 7, 6, 5, 3], [7, 6], [4, 0]] + >>> [type(i).__name__ for i in [xs + ys, xs & ys, xs - ys]] + ['addset', 'filteredset', 'filteredset'] + + Populate "_set" fields in the lists so set optimization may be used: + >>> [1 in xs, 3 in ys] + [False, True] + + Without sort(), results won't be changed: + >>> [list(i) for i in [xs + ys, xs & ys, xs - ys]] + [[4, 0, 7, 6, 5, 3], [7, 6], [4, 0]] + >>> [type(i).__name__ for i in [xs + ys, xs & ys, xs - ys]] + ['addset', 'filteredset', 'filteredset'] + + With sort(), set optimization could be used: + >>> xs.sort(reverse=True) + >>> [list(i) for i in [xs + ys, xs & ys, xs - ys]] + [[7, 6, 4, 0, 5, 3], [7, 6], [4, 0]] + >>> [type(i).__name__ for i in [xs + ys, xs & ys, xs - ys]] + ['addset', 'baseset', 'baseset'] + + >>> ys.sort() + >>> [list(i) for i in [xs + ys, xs & ys, xs - ys]] + [[7, 6, 4, 0, 3, 5], [7, 6], [4, 0]] + >>> [type(i).__name__ for i in [xs + ys, xs & ys, xs - ys]] + ['addset', 'baseset', 'baseset'] + + istopo is preserved across set operations + >>> xs = baseset(set(x), istopo=True) + >>> rs = xs & ys + >>> type(rs).__name__ + 'baseset' + >>> rs._istopo + True + """ + def __init__(self, data=(), datarepr=None, istopo=False): + """ + datarepr: a tuple of (format, obj, ...), a function or an object that + provides a printable representation of the given data. 
+ """ + self._ascending = None + self._istopo = istopo + if isinstance(data, set): + # converting set to list has a cost, do it lazily + self._set = data + # set has no order we pick one for stability purpose + self._ascending = True + else: + if not isinstance(data, list): + data = list(data) + self._list = data + self._datarepr = datarepr + + @util.propertycache + def _set(self): + return set(self._list) + + @util.propertycache + def _asclist(self): + asclist = self._list[:] + asclist.sort() + return asclist + + @util.propertycache + def _list(self): + # _list is only lazily constructed if we have _set + assert '_set' in self.__dict__ + return list(self._set) + + def __iter__(self): + if self._ascending is None: + return iter(self._list) + elif self._ascending: + return iter(self._asclist) + else: + return reversed(self._asclist) + + def fastasc(self): + return iter(self._asclist) + + def fastdesc(self): + return reversed(self._asclist) + + @util.propertycache + def __contains__(self): + return self._set.__contains__ + + def __nonzero__(self): + return bool(len(self)) + + __bool__ = __nonzero__ + + def sort(self, reverse=False): + self._ascending = not bool(reverse) + self._istopo = False + + def reverse(self): + if self._ascending is None: + self._list.reverse() + else: + self._ascending = not self._ascending + self._istopo = False + + def __len__(self): + if '_list' in self.__dict__: + return len(self._list) + else: + return len(self._set) + + def isascending(self): + """Returns True if the collection is ascending order, False if not. + + This is part of the mandatory API for smartset.""" + if len(self) <= 1: + return True + return self._ascending is not None and self._ascending + + def isdescending(self): + """Returns True if the collection is descending order, False if not. 
+ + This is part of the mandatory API for smartset.""" + if len(self) <= 1: + return True + return self._ascending is not None and not self._ascending + + def istopo(self): + """Returns True if the collection is in topographical order, False if not. + + This is part of the mandatory API for smartset.""" + if len(self) <= 1: + return True + return self._istopo + + def first(self): + if self: + if self._ascending is None: + return self._list[0] + elif self._ascending: + return self._asclist[0] + else: + return self._asclist[-1] + return None + + def last(self): + if self: + if self._ascending is None: + return self._list[-1] + elif self._ascending: + return self._asclist[-1] + else: + return self._asclist[0] + return None + + def _fastsetop(self, other, op): + # try to use native set operations as fast paths + if (type(other) is baseset and '_set' in other.__dict__ and '_set' in + self.__dict__ and self._ascending is not None): + s = baseset(data=getattr(self._set, op)(other._set), + istopo=self._istopo) + s._ascending = self._ascending + else: + s = getattr(super(baseset, self), op)(other) + return s + + def __and__(self, other): + return self._fastsetop(other, '__and__') + + def __sub__(self, other): + return self._fastsetop(other, '__sub__') + + def __repr__(self): + d = {None: '', False: '-', True: '+'}[self._ascending] + s = _formatsetrepr(self._datarepr) + if not s: + l = self._list + # if _list has been built from a set, it might have a different + # order from one python implementation to another. + # We fall back to the sorted version for a stable output. 
+ if self._ascending is not None: + l = self._asclist + s = repr(l) + return '<%s%s %s>' % (type(self).__name__, d, s) + +class filteredset(abstractsmartset): + """Duck type for baseset class which iterates lazily over the revisions in + the subset and contains a function which tests for membership in the + revset + """ + def __init__(self, subset, condition=lambda x: True, condrepr=None): + """ + condition: a function that decide whether a revision in the subset + belongs to the revset or not. + condrepr: a tuple of (format, obj, ...), a function or an object that + provides a printable representation of the given condition. + """ + self._subset = subset + self._condition = condition + self._condrepr = condrepr + + def __contains__(self, x): + return x in self._subset and self._condition(x) + + def __iter__(self): + return self._iterfilter(self._subset) + + def _iterfilter(self, it): + cond = self._condition + for x in it: + if cond(x): + yield x + + @property + def fastasc(self): + it = self._subset.fastasc + if it is None: + return None + return lambda: self._iterfilter(it()) + + @property + def fastdesc(self): + it = self._subset.fastdesc + if it is None: + return None + return lambda: self._iterfilter(it()) + + def __nonzero__(self): + fast = None + candidates = [self.fastasc if self.isascending() else None, + self.fastdesc if self.isdescending() else None, + self.fastasc, + self.fastdesc] + for candidate in candidates: + if candidate is not None: + fast = candidate + break + + if fast is not None: + it = fast() + else: + it = self + + for r in it: + return True + return False + + __bool__ = __nonzero__ + + def __len__(self): + # Basic implementation to be changed in future patches. 
+ # until this gets improved, we use a generator expression + # here, since list comprehensions are free to call __len__ again + # causing infinite recursion + l = baseset(r for r in self) + return len(l) + + def sort(self, reverse=False): + self._subset.sort(reverse=reverse) + + def reverse(self): + self._subset.reverse() + + def isascending(self): + return self._subset.isascending() + + def isdescending(self): + return self._subset.isdescending() + + def istopo(self): + return self._subset.istopo() + + def first(self): + for x in self: + return x + return None + + def last(self): + it = None + if self.isascending(): + it = self.fastdesc + elif self.isdescending(): + it = self.fastasc + if it is not None: + for x in it(): + return x + return None # empty case + else: + x = None + for x in self: + pass + return x + + def __repr__(self): + xs = [repr(self._subset)] + s = _formatsetrepr(self._condrepr) + if s: + xs.append(s) + return '<%s %s>' % (type(self).__name__, ', '.join(xs)) + +def _iterordered(ascending, iter1, iter2): + """produce an ordered iteration from two iterators with the same order + + The ascending argument indicates the iteration direction. 
+ """ + choice = max + if ascending: + choice = min + + val1 = None + val2 = None + try: + # Consume both iterators in an ordered way until one is empty + while True: + if val1 is None: + val1 = next(iter1) + if val2 is None: + val2 = next(iter2) + n = choice(val1, val2) + yield n + if val1 == n: + val1 = None + if val2 == n: + val2 = None + except StopIteration: + # Flush any remaining values and consume the other one + it = iter2 + if val1 is not None: + yield val1 + it = iter1 + elif val2 is not None: + # might have been equality and both are empty + yield val2 + for val in it: + yield val + +class addset(abstractsmartset): + """Represent the addition of two sets + + Wrapper structure for lazily adding two structures without losing much + performance on the __contains__ method + + If the ascending attribute is set, that means the two structures are + ordered in either an ascending or descending way. Therefore, we can add + them maintaining the order by iterating over both at the same time + + >>> xs = baseset([0, 3, 2]) + >>> ys = baseset([5, 2, 4]) + + >>> rs = addset(xs, ys) + >>> bool(rs), 0 in rs, 1 in rs, 5 in rs, rs.first(), rs.last() + (True, True, False, True, 0, 4) + >>> rs = addset(xs, baseset([])) + >>> bool(rs), 0 in rs, 1 in rs, rs.first(), rs.last() + (True, True, False, 0, 2) + >>> rs = addset(baseset([]), baseset([])) + >>> bool(rs), 0 in rs, rs.first(), rs.last() + (False, False, None, None) + + iterate unsorted: + >>> rs = addset(xs, ys) + >>> # (use generator because pypy could call len()) + >>> list(x for x in rs) # without _genlist + [0, 3, 2, 5, 4] + >>> assert not rs._genlist + >>> len(rs) + 5 + >>> [x for x in rs] # with _genlist + [0, 3, 2, 5, 4] + >>> assert rs._genlist + + iterate ascending: + >>> rs = addset(xs, ys, ascending=True) + >>> # (use generator because pypy could call len()) + >>> list(x for x in rs), list(x for x in rs.fastasc()) # without _asclist + ([0, 2, 3, 4, 5], [0, 2, 3, 4, 5]) + >>> assert not rs._asclist + >>> 
len(rs) + 5 + >>> [x for x in rs], [x for x in rs.fastasc()] + ([0, 2, 3, 4, 5], [0, 2, 3, 4, 5]) + >>> assert rs._asclist + + iterate descending: + >>> rs = addset(xs, ys, ascending=False) + >>> # (use generator because pypy could call len()) + >>> list(x for x in rs), list(x for x in rs.fastdesc()) # without _asclist + ([5, 4, 3, 2, 0], [5, 4, 3, 2, 0]) + >>> assert not rs._asclist + >>> len(rs) + 5 + >>> [x for x in rs], [x for x in rs.fastdesc()] + ([5, 4, 3, 2, 0], [5, 4, 3, 2, 0]) + >>> assert rs._asclist + + iterate ascending without fastasc: + >>> rs = addset(xs, generatorset(ys), ascending=True) + >>> assert rs.fastasc is None + >>> [x for x in rs] + [0, 2, 3, 4, 5] + + iterate descending without fastdesc: + >>> rs = addset(generatorset(xs), ys, ascending=False) + >>> assert rs.fastdesc is None + >>> [x for x in rs] + [5, 4, 3, 2, 0] + """ + def __init__(self, revs1, revs2, ascending=None): + self._r1 = revs1 + self._r2 = revs2 + self._iter = None + self._ascending = ascending + self._genlist = None + self._asclist = None + + def __len__(self): + return len(self._list) + + def __nonzero__(self): + return bool(self._r1) or bool(self._r2) + + __bool__ = __nonzero__ + + @util.propertycache + def _list(self): + if not self._genlist: + self._genlist = baseset(iter(self)) + return self._genlist + + def __iter__(self): + """Iterate over both collections without repeating elements + + If the ascending attribute is not set, iterate over the first one and + then over the second one checking for membership on the first one so we + dont yield any duplicates. + + If the ascending attribute is set, iterate over both collections at the + same time, yielding only one value at a time in the given order. 
+ """ + if self._ascending is None: + if self._genlist: + return iter(self._genlist) + def arbitraryordergen(): + for r in self._r1: + yield r + inr1 = self._r1.__contains__ + for r in self._r2: + if not inr1(r): + yield r + return arbitraryordergen() + # try to use our own fast iterator if it exists + self._trysetasclist() + if self._ascending: + attr = 'fastasc' + else: + attr = 'fastdesc' + it = getattr(self, attr) + if it is not None: + return it() + # maybe half of the component supports fast + # get iterator for _r1 + iter1 = getattr(self._r1, attr) + if iter1 is None: + # let's avoid side effect (not sure it matters) + iter1 = iter(sorted(self._r1, reverse=not self._ascending)) + else: + iter1 = iter1() + # get iterator for _r2 + iter2 = getattr(self._r2, attr) + if iter2 is None: + # let's avoid side effect (not sure it matters) + iter2 = iter(sorted(self._r2, reverse=not self._ascending)) + else: + iter2 = iter2() + return _iterordered(self._ascending, iter1, iter2) + + def _trysetasclist(self): + """populate the _asclist attribute if possible and necessary""" + if self._genlist is not None and self._asclist is None: + self._asclist = sorted(self._genlist) + + @property + def fastasc(self): + self._trysetasclist() + if self._asclist is not None: + return self._asclist.__iter__ + iter1 = self._r1.fastasc + iter2 = self._r2.fastasc + if None in (iter1, iter2): + return None + return lambda: _iterordered(True, iter1(), iter2()) + + @property + def fastdesc(self): + self._trysetasclist() + if self._asclist is not None: + return self._asclist.__reversed__ + iter1 = self._r1.fastdesc + iter2 = self._r2.fastdesc + if None in (iter1, iter2): + return None + return lambda: _iterordered(False, iter1(), iter2()) + + def __contains__(self, x): + return x in self._r1 or x in self._r2 + + def sort(self, reverse=False): + """Sort the added set + + For this we use the cached list with all the generated values and if we + know they are ascending or descending we can sort 
them in a smart way. + """ + self._ascending = not reverse + + def isascending(self): + return self._ascending is not None and self._ascending + + def isdescending(self): + return self._ascending is not None and not self._ascending + + def istopo(self): + # not worth the trouble asserting if the two sets combined are still + # in topographical order. Use the sort() predicate to explicitly sort + # again instead. + return False + + def reverse(self): + if self._ascending is None: + self._list.reverse() + else: + self._ascending = not self._ascending + + def first(self): + for x in self: + return x + return None + + def last(self): + self.reverse() + val = self.first() + self.reverse() + return val + + def __repr__(self): + d = {None: '', False: '-', True: '+'}[self._ascending] + return '<%s%s %r, %r>' % (type(self).__name__, d, self._r1, self._r2) + +class generatorset(abstractsmartset): + """Wrap a generator for lazy iteration + + Wrapper structure for generators that provides lazy membership and can + be iterated more than once. + When asked for membership it generates values until either it finds the + requested one or has gone through all the elements in the generator + """ + def __init__(self, gen, iterasc=None): + """ + gen: a generator producing the values for the generatorset. + """ + self._gen = gen + self._asclist = None + self._cache = {} + self._genlist = [] + self._finished = False + self._ascending = True + if iterasc is not None: + if iterasc: + self.fastasc = self._iterator + self.__contains__ = self._asccontains + else: + self.fastdesc = self._iterator + self.__contains__ = self._desccontains + + def __nonzero__(self): + # Do not use 'for r in self' because it will enforce the iteration + # order (default ascending), possibly unrolling a whole descending + # iterator. 
+ if self._genlist: + return True + for r in self._consumegen(): + return True + return False + + __bool__ = __nonzero__ + + def __contains__(self, x): + if x in self._cache: + return self._cache[x] + + # Use new values only, as existing values would be cached. + for l in self._consumegen(): + if l == x: + return True + + self._cache[x] = False + return False + + def _asccontains(self, x): + """version of contains optimised for ascending generator""" + if x in self._cache: + return self._cache[x] + + # Use new values only, as existing values would be cached. + for l in self._consumegen(): + if l == x: + return True + if l > x: + break + + self._cache[x] = False + return False + + def _desccontains(self, x): + """version of contains optimised for descending generator""" + if x in self._cache: + return self._cache[x] + + # Use new values only, as existing values would be cached. + for l in self._consumegen(): + if l == x: + return True + if l < x: + break + + self._cache[x] = False + return False + + def __iter__(self): + if self._ascending: + it = self.fastasc + else: + it = self.fastdesc + if it is not None: + return it() + # we need to consume the iterator + for x in self._consumegen(): + pass + # recall the same code + return iter(self) + + def _iterator(self): + if self._finished: + return iter(self._genlist) + + # We have to use this complex iteration strategy to allow multiple + # iterations at the same time. We need to be able to catch revision + # removed from _consumegen and added to genlist in another instance. + # + # Getting rid of it would provide an about 15% speed up on this + # iteration. 
+ genlist = self._genlist + nextgen = self._consumegen() + _len, _next = len, next # cache global lookup + def gen(): + i = 0 + while True: + if i < _len(genlist): + yield genlist[i] + else: + yield _next(nextgen) + i += 1 + return gen() + + def _consumegen(self): + cache = self._cache + genlist = self._genlist.append + for item in self._gen: + cache[item] = True + genlist(item) + yield item + if not self._finished: + self._finished = True + asc = self._genlist[:] + asc.sort() + self._asclist = asc + self.fastasc = asc.__iter__ + self.fastdesc = asc.__reversed__ + + def __len__(self): + for x in self._consumegen(): + pass + return len(self._genlist) + + def sort(self, reverse=False): + self._ascending = not reverse + + def reverse(self): + self._ascending = not self._ascending + + def isascending(self): + return self._ascending + + def isdescending(self): + return not self._ascending + + def istopo(self): + # not worth the trouble asserting if the two sets combined are still + # in topographical order. Use the sort() predicate to explicitly sort + # again instead. 
+ return False + + def first(self): + if self._ascending: + it = self.fastasc + else: + it = self.fastdesc + if it is None: + # we need to consume all and try again + for x in self._consumegen(): + pass + return self.first() + return next(it(), None) + + def last(self): + if self._ascending: + it = self.fastdesc + else: + it = self.fastasc + if it is None: + # we need to consume all and try again + for x in self._consumegen(): + pass + return self.last() + return next(it(), None) + + def __repr__(self): + d = {False: '-', True: '+'}[self._ascending] + return '<%s%s>' % (type(self).__name__, d) + +class spanset(abstractsmartset): + """Duck type for baseset class which represents a range of revisions and + can work lazily and without having all the range in memory + + Note that spanset(x, y) behaves almost like xrange(x, y) except for two + notable points: + - when x > y it will be automatically descending, + - revisions filtered with this repoview will be skipped. + + """ + def __init__(self, repo, start=0, end=None): + """ + start: first revision included in the set + (default to 0) + end: first revision excluded (last+1) + (default to len(repo)) + + Spanset will be descending if `end` < `start`. + """ + if end is None: + end = len(repo) + self._ascending = start <= end + if not self._ascending: + start, end = end + 1, start + 1 + self._start = start + self._end = end + self._hiddenrevs = repo.changelog.filteredrevs + + def sort(self, reverse=False): + self._ascending = not reverse + + def reverse(self): + self._ascending = not self._ascending + + def istopo(self): + # not worth the trouble asserting if the two sets combined are still + # in topographical order. Use the sort() predicate to explicitly sort + # again instead. 
+ return False + + def _iterfilter(self, iterrange): + s = self._hiddenrevs + for r in iterrange: + if r not in s: + yield r + + def __iter__(self): + if self._ascending: + return self.fastasc() + else: + return self.fastdesc() + + def fastasc(self): + iterrange = xrange(self._start, self._end) + if self._hiddenrevs: + return self._iterfilter(iterrange) + return iter(iterrange) + + def fastdesc(self): + iterrange = xrange(self._end - 1, self._start - 1, -1) + if self._hiddenrevs: + return self._iterfilter(iterrange) + return iter(iterrange) + + def __contains__(self, rev): + hidden = self._hiddenrevs + return ((self._start <= rev < self._end) + and not (hidden and rev in hidden)) + + def __nonzero__(self): + for r in self: + return True + return False + + __bool__ = __nonzero__ + + def __len__(self): + if not self._hiddenrevs: + return abs(self._end - self._start) + else: + count = 0 + start = self._start + end = self._end + for rev in self._hiddenrevs: + if (end < rev <= start) or (start <= rev < end): + count += 1 + return abs(self._end - self._start) - count + + def isascending(self): + return self._ascending + + def isdescending(self): + return not self._ascending + + def first(self): + if self._ascending: + it = self.fastasc + else: + it = self.fastdesc + for x in it(): + return x + return None + + def last(self): + if self._ascending: + it = self.fastdesc + else: + it = self.fastasc + for x in it(): + return x + return None + + def __repr__(self): + d = {False: '-', True: '+'}[self._ascending] + return '<%s%s %d:%d>' % (type(self).__name__, d, + self._start, self._end - 1) + +class fullreposet(spanset): + """a set containing all revisions in the repo + + This class exists to host special optimization and magic to handle virtual + revisions such as "null". + """ + + def __init__(self, repo): + super(fullreposet, self).__init__(repo) + + def __and__(self, other): + """As self contains the whole repo, all of the other set should also be + in self. 
Therefore `self & other = other`. + + This boldly assumes the other contains valid revs only. + """ + # other not a smartset, make is so + if not util.safehasattr(other, 'isascending'): + # filter out hidden revision + # (this boldly assumes all smartset are pure) + # + # `other` was used with "&", let's assume this is a set like + # object. + other = baseset(other - self._hiddenrevs) + + other.sort(reverse=self.isdescending()) + return other + +def prettyformat(revs): + lines = [] + rs = repr(revs) + p = 0 + while p < len(rs): + q = rs.find('<', p + 1) + if q < 0: + q = len(rs) + l = rs.count('<', 0, p) - rs.count('>', 0, p) + assert l >= 0 + lines.append((l, rs[p:q].rstrip())) + p = q + return '\n'.join(' ' * l + s for l, s in lines) diff -r ed5b25874d99 -r 4baf79a77afa mercurial/sshpeer.py --- a/mercurial/sshpeer.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/sshpeer.py Fri Mar 24 08:37:26 2017 -0700 @@ -150,7 +150,7 @@ util.shellquote("%s init %s" % (_serverquote(remotecmd), _serverquote(self.path)))) ui.debug('running %s\n' % cmd) - res = ui.system(cmd) + res = ui.system(cmd, blockedtag='sshpeer') if res != 0: self._abort(error.RepoError(_("could not create remote repo"))) diff -r ed5b25874d99 -r 4baf79a77afa mercurial/sslutil.py --- a/mercurial/sslutil.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/sslutil.py Fri Mar 24 08:37:26 2017 -0700 @@ -720,7 +720,8 @@ # to load the system CA store. If we're running on Apple Python, use this # trick. 
if _plainapplepython(): - dummycert = os.path.join(os.path.dirname(__file__), 'dummycert.pem') + dummycert = os.path.join( + os.path.dirname(pycompat.fsencode(__file__)), 'dummycert.pem') if os.path.exists(dummycert): return dummycert @@ -814,6 +815,16 @@ if peerfingerprints[hash].lower() == fingerprint: ui.debug('%s certificate matched fingerprint %s:%s\n' % (host, hash, fmtfingerprint(fingerprint))) + if settings['legacyfingerprint']: + ui.warn(_('(SHA-1 fingerprint for %s found in legacy ' + '[hostfingerprints] section; ' + 'if you trust this fingerprint, set the ' + 'following config value in [hostsecurity] and ' + 'remove the old one from [hostfingerprints] ' + 'to upgrade to a more secure SHA-256 ' + 'fingerprint: ' + '%s.fingerprints=%s)\n') % ( + host, host, nicefingerprint)) return # Pinned fingerprint didn't match. This is a fatal error. diff -r ed5b25874d99 -r 4baf79a77afa mercurial/statichttprepo.py --- a/mercurial/statichttprepo.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/statichttprepo.py Fri Mar 24 08:37:26 2017 -0700 @@ -24,6 +24,7 @@ store, url, util, + vfs as vfsmod, ) urlerr = util.urlerr @@ -86,7 +87,7 @@ urlopener = url.opener(ui, authinfo) urlopener.add_handler(byterange.HTTPRangeHandler()) - class statichttpvfs(scmutil.abstractvfs): + class statichttpvfs(vfsmod.abstractvfs): def __init__(self, base): self.base = base @@ -121,9 +122,8 @@ u = util.url(path.rstrip('/') + "/.hg") self.path, authinfo = u.authinfo() - opener = build_opener(ui, authinfo) - self.opener = opener(self.path) - self.vfs = self.opener + vfsclass = build_opener(ui, authinfo) + self.vfs = vfsclass(self.path) self._phasedefaults = [] self.names = namespaces.namespaces() @@ -148,7 +148,7 @@ raise error.RepoError(msg) # setup store - self.store = store.store(requirements, self.path, opener) + self.store = store.store(requirements, self.path, vfsclass) self.spath = self.store.path self.svfs = self.store.opener self.sjoin = self.store.join diff -r ed5b25874d99 -r 
4baf79a77afa mercurial/statprof.py --- a/mercurial/statprof.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/statprof.py Fri Mar 24 08:37:26 2017 -0700 @@ -433,6 +433,7 @@ Hotpath = 3 FlameGraph = 4 Json = 5 + Chrome = 6 def display(fp=None, format=3, data=None, **kwargs): '''Print statistics, either to stdout or the given file object.''' @@ -457,10 +458,12 @@ write_to_flame(data, fp, **kwargs) elif format == DisplayFormats.Json: write_to_json(data, fp) + elif format == DisplayFormats.Chrome: + write_to_chrome(data, fp, **kwargs) else: raise Exception("Invalid display format") - if format != DisplayFormats.Json: + if format not in (DisplayFormats.Json, DisplayFormats.Chrome): print('---', file=fp) print('Sample count: %d' % len(data.samples), file=fp) print('Total time: %f seconds' % data.accumulated_time, file=fp) @@ -713,6 +716,23 @@ os.system("perl ~/flamegraph.pl %s > %s" % (path, outputfile)) print("Written to %s" % outputfile, file=fp) +_pathcache = {} +def simplifypath(path): + '''Attempt to make the path to a Python module easier to read by + removing whatever part of the Python search path it was found + on.''' + + if path in _pathcache: + return _pathcache[path] + hgpath = pycompat.fsencode(encoding.__file__).rsplit(os.sep, 2)[0] + for p in [hgpath] + sys.path: + prefix = p + os.sep + if path.startswith(prefix): + path = path[len(prefix):] + break + _pathcache[path] = path + return path + def write_to_json(data, fp): samples = [] @@ -726,6 +746,102 @@ print(json.dumps(samples), file=fp) +def write_to_chrome(data, fp, minthreshold=0.005, maxthreshold=0.999): + samples = [] + laststack = collections.deque() + lastseen = collections.deque() + + # The Chrome tracing format allows us to use a compact stack + # representation to save space. It's fiddly but worth it. + # We maintain a bijection between stack and ID. 
+ stack2id = {} + id2stack = [] # will eventually be rendered + + def stackid(stack): + if not stack: + return + if stack in stack2id: + return stack2id[stack] + parent = stackid(stack[1:]) + myid = len(stack2id) + stack2id[stack] = myid + id2stack.append(dict(category=stack[0][0], name='%s %s' % stack[0])) + if parent is not None: + id2stack[-1].update(parent=parent) + return myid + + def endswith(a, b): + return list(a)[-len(b):] == list(b) + + # The sampling profiler can sample multiple times without + # advancing the clock, potentially causing the Chrome trace viewer + # to render single-pixel columns that we cannot zoom in on. We + # work around this by pretending that zero-duration samples are a + # millisecond in length. + + clamp = 0.001 + + # We provide knobs that by default attempt to filter out stack + # frames that are too noisy: + # + # * A few take almost all execution time. These are usually boring + # setup functions, giving a stack that is deep but uninformative. + # + # * Numerous samples take almost no time, but introduce lots of + # noisy, oft-deep "spines" into a rendered profile. + + blacklist = set() + totaltime = data.samples[-1].time - data.samples[0].time + minthreshold = totaltime * minthreshold + maxthreshold = max(totaltime * maxthreshold, clamp) + + def poplast(): + oldsid = stackid(tuple(laststack)) + oldcat, oldfunc = laststack.popleft() + oldtime, oldidx = lastseen.popleft() + duration = sample.time - oldtime + if minthreshold <= duration <= maxthreshold: + # ensure no zero-duration events + sampletime = max(oldtime + clamp, sample.time) + samples.append(dict(ph='E', name=oldfunc, cat=oldcat, sf=oldsid, + ts=sampletime*1e6, pid=0)) + else: + blacklist.add(oldidx) + + # Much fiddling to synthesize correctly(ish) nested begin/end + # events given only stack snapshots. 
+ + for sample in data.samples: + tos = sample.stack[0] + name = tos.function + path = simplifypath(tos.path) + category = '%s:%d' % (path, tos.lineno) + stack = tuple((('%s:%d' % (simplifypath(frame.path), frame.lineno), + frame.function) for frame in sample.stack)) + qstack = collections.deque(stack) + if laststack == qstack: + continue + while laststack and qstack and laststack[-1] == qstack[-1]: + laststack.pop() + qstack.pop() + while laststack: + poplast() + for f in reversed(qstack): + lastseen.appendleft((sample.time, len(samples))) + laststack.appendleft(f) + path, name = f + sid = stackid(tuple(laststack)) + samples.append(dict(ph='B', name=name, cat=path, ts=sample.time*1e6, + sf=sid, pid=0)) + laststack = collections.deque(stack) + while laststack: + poplast() + events = [s[1] for s in enumerate(samples) if s[0] not in blacklist] + frames = collections.OrderedDict((str(k), v) + for (k,v) in enumerate(id2stack)) + json.dump(dict(traceEvents=events, stackFrames=frames), fp, indent=1) + fp.write('\n') + def printusage(): print(""" The statprof command line allows you to inspect the last profile's results in diff -r ed5b25874d99 -r 4baf79a77afa mercurial/store.py --- a/mercurial/store.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/store.py Fri Mar 24 08:37:26 2017 -0700 @@ -17,8 +17,8 @@ error, parsers, pycompat, - scmutil, util, + vfs as vfsmod, ) # This avoids a collision between a file named foo and a dir named @@ -99,12 +99,8 @@ 'the\\x07quick\\xadshot' ''' e = '_' - if pycompat.ispy3: - xchr = lambda x: bytes([x]) - asciistr = bytes(xrange(127)) - else: - xchr = chr - asciistr = map(chr, xrange(127)) + xchr = pycompat.bytechr + asciistr = list(map(xchr, range(127))) capitals = list(range(ord("A"), ord("Z") + 1)) cmap = dict((x, x) for x in asciistr) @@ -128,7 +124,7 @@ pass else: raise KeyError - return (lambda s: ''.join([cmap[c] for c in s]), + return (lambda s: ''.join([cmap[s[c:c + 1]] for c in xrange(len(s))]), lambda s: 
''.join(list(decode(s)))) _encodefname, _decodefname = _buildencodefun() @@ -197,22 +193,22 @@ if not n: continue if dotencode and n[0] in '. ': - n = "~%02x" % ord(n[0]) + n[1:] + n = "~%02x" % ord(n[0:1]) + n[1:] path[i] = n else: l = n.find('.') if l == -1: l = len(n) if ((l == 3 and n[:3] in _winres3) or - (l == 4 and n[3] <= '9' and n[3] >= '1' + (l == 4 and n[3:4] <= '9' and n[3:4] >= '1' and n[:3] in _winres4)): # encode third letter ('aux' -> 'au~78') - ec = "~%02x" % ord(n[2]) + ec = "~%02x" % ord(n[2:3]) n = n[0:2] + ec + n[3:] path[i] = n if n[-1] in '. ': # encode last period or space ('foo...' -> 'foo..~2e') - path[i] = n[:-1] + "~%02x" % ord(n[-1]) + path[i] = n[:-1] + "~%02x" % ord(n[-1:]) return path _maxstorepathlen = 120 @@ -325,7 +321,7 @@ self.createmode = _calcmode(vfs) vfs.createmode = self.createmode self.rawvfs = vfs - self.vfs = scmutil.filtervfs(vfs, encodedir) + self.vfs = vfsmod.filtervfs(vfs, encodedir) self.opener = self.vfs def join(self, f): @@ -398,7 +394,7 @@ self.createmode = _calcmode(vfs) vfs.createmode = self.createmode self.rawvfs = vfs - self.vfs = scmutil.filtervfs(vfs, encodefilename) + self.vfs = vfsmod.filtervfs(vfs, encodefilename) self.opener = self.vfs def datafiles(self): @@ -477,9 +473,9 @@ self._load() return iter(self.entries) -class _fncachevfs(scmutil.abstractvfs, scmutil.auditvfs): +class _fncachevfs(vfsmod.abstractvfs, vfsmod.auditvfs): def __init__(self, vfs, fnc, encode): - scmutil.auditvfs.__init__(self, vfs) + vfsmod.auditvfs.__init__(self, vfs) self.fncache = fnc self.encode = encode diff -r ed5b25874d99 -r 4baf79a77afa mercurial/streamclone.py --- a/mercurial/streamclone.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/streamclone.py Fri Mar 24 08:37:26 2017 -0700 @@ -8,7 +8,6 @@ from __future__ import absolute_import import struct -import time from .i18n import _ from . 
import ( @@ -297,7 +296,7 @@ (filecount, util.bytecount(bytecount))) handled_bytes = 0 repo.ui.progress(_('clone'), 0, total=bytecount, unit=_('bytes')) - start = time.time() + start = util.timer() # TODO: get rid of (potential) inconsistency # @@ -340,7 +339,7 @@ # streamclone-ed file at next access repo.invalidate(clearfilecache=True) - elapsed = time.time() - start + elapsed = util.timer() - start if elapsed <= 0: elapsed = 0.001 repo.ui.progress(_('clone'), None) diff -r ed5b25874d99 -r 4baf79a77afa mercurial/subrepo.py --- a/mercurial/subrepo.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/subrepo.py Fri Mar 24 08:37:26 2017 -0700 @@ -35,6 +35,7 @@ pycompat, scmutil, util, + vfs as vfsmod, ) hg = None @@ -129,7 +130,7 @@ for pattern, repl in p.items('subpaths'): # Turn r'C:\foo\bar' into r'C:\\foo\\bar' since re.sub # does a string decode. - repl = repl.encode('string-escape') + repl = util.escapestr(repl) # However, we still want to allow back references to go # through unharmed, so we turn r'\\1' into r'\1'. Again, # extra escapes are needed because re.sub string decodes. 
@@ -547,8 +548,8 @@ """return filename iterator""" raise NotImplementedError - def filedata(self, name): - """return file data""" + def filedata(self, name, decode): + """return file data, optionally passed through repo decoders""" raise NotImplementedError def fileflags(self, name): @@ -563,7 +564,7 @@ """handle the files command for this subrepo""" return 1 - def archive(self, archiver, prefix, match=None): + def archive(self, archiver, prefix, match=None, decode=True): if match is not None: files = [f for f in self.files() if match(f)] else: @@ -577,7 +578,7 @@ mode = 'x' in flags and 0o755 or 0o644 symlink = 'l' in flags archiver.addfile(prefix + self._path + '/' + name, - mode, symlink, self.filedata(name)) + mode, symlink, self.filedata(name, decode)) self.ui.progress(_('archiving (%s)') % relpath, i + 1, unit=_('files'), total=total) self.ui.progress(_('archiving (%s)') % relpath, None) @@ -620,7 +621,7 @@ def wvfs(self): """return vfs to access the working directory of this subrepository """ - return scmutil.vfs(self._ctx.repo().wvfs.join(self._path)) + return vfsmod.vfs(self._ctx.repo().wvfs.join(self._path)) @propertycache def _relpath(self): @@ -682,7 +683,7 @@ @propertycache def _cachestorehashvfs(self): - return scmutil.vfs(self._repo.join('cache/storehash')) + return vfsmod.vfs(self._repo.vfs.join('cache/storehash')) def _readstorehashcache(self, remotepath): '''read the store hash cache for a given remote repository''' @@ -787,7 +788,7 @@ % (inst, subrelpath(self))) @annotatesubrepoerror - def archive(self, archiver, prefix, match=None): + def archive(self, archiver, prefix, match=None, decode=True): self._get(self._state + ('hg',)) total = abstractsubrepo.archive(self, archiver, prefix, match) rev = self._state[1] @@ -795,7 +796,8 @@ for subpath in ctx.substate: s = subrepo(ctx, subpath, True) submatch = matchmod.subdirmatcher(subpath, match) - total += s.archive(archiver, prefix + self._path + '/', submatch) + total += s.archive(archiver, prefix + 
self._path + '/', submatch, + decode) return total @annotatesubrepoerror @@ -961,9 +963,12 @@ ctx = self._repo[rev] return ctx.manifest().keys() - def filedata(self, name): + def filedata(self, name, decode): rev = self._state[1] - return self._repo[rev][name].data() + data = self._repo[rev][name].data() + if decode: + data = self._repo.wwritedata(name, data) + return data def fileflags(self, name): rev = self._state[1] @@ -1297,7 +1302,7 @@ paths.append(name.encode('utf-8')) return paths - def filedata(self, name): + def filedata(self, name, decode): return self._svncommand(['cat'], name)[0] @@ -1415,6 +1420,10 @@ errpipe = None if self.ui.quiet: errpipe = open(os.devnull, 'w') + if self.ui._colormode and len(commands) and commands[0] == "diff": + # insert the argument in the front, + # the end of git diff arguments is used for paths + commands.insert(1, '--color') p = subprocess.Popen([self._gitexecutable] + commands, bufsize=-1, cwd=cwd, env=env, close_fds=util.closefds, stdout=subprocess.PIPE, stderr=errpipe) @@ -1777,7 +1786,7 @@ else: self.wvfs.unlink(f) - def archive(self, archiver, prefix, match=None): + def archive(self, archiver, prefix, match=None, decode=True): total = 0 source, revision = self._state if not revision: diff -r ed5b25874d99 -r 4baf79a77afa mercurial/tagmerge.py --- a/mercurial/tagmerge.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/tagmerge.py Fri Mar 24 08:37:26 2017 -0700 @@ -169,7 +169,7 @@ # finally we can join the sorted groups to get the final contents of the # merged .hgtags file, and then write it to disk mergedtagstring = '\n'.join([tags for rank, tags in finaltags if tags]) - fp = repo.wfile('.hgtags', 'wb') + fp = repo.wvfs('.hgtags', 'wb') fp.write(mergedtagstring + '\n') fp.close() diff -r ed5b25874d99 -r 4baf79a77afa mercurial/tags.py --- a/mercurial/tags.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/tags.py Fri Mar 24 08:37:26 2017 -0700 @@ -12,9 +12,7 @@ from __future__ import absolute_import -import array import 
errno -import time from .node import ( bin, @@ -25,11 +23,10 @@ from . import ( encoding, error, + scmutil, util, ) -array = array.array - # Tags computation can be expensive and caches exist to make it fast in # the common case. # @@ -278,8 +275,6 @@ If the cache is not up to date, the caller is responsible for reading tag info from each returned head. (See findglobaltags().) ''' - from . import scmutil # avoid cycle - try: cachefile = repo.vfs(_filename(repo), 'r') # force reading the file for static-http @@ -344,7 +339,7 @@ # potentially expensive search. return ([], {}, valid, None, True) - starttime = time.time() + starttime = util.timer() # Now we have to lookup the .hgtags filenode for every new head. # This is the most expensive part of finding tags, so performance @@ -359,7 +354,7 @@ fnodescache.write() - duration = time.time() - starttime + duration = util.timer() - starttime ui.log('tagscache', '%d/%d cache hits/lookups in %0.4f ' 'seconds\n', @@ -430,13 +425,12 @@ self.lookupcount = 0 self.hitcount = 0 - self._raw = array('c') try: data = repo.vfs.read(_fnodescachefile) except (OSError, IOError): data = "" - self._raw.fromstring(data) + self._raw = bytearray(data) # The end state of self._raw is an array that is of the exact length # required to hold a record for every revision in the repository. @@ -477,7 +471,7 @@ self.lookupcount += 1 offset = rev * _fnodesrecsize - record = self._raw[offset:offset + _fnodesrecsize].tostring() + record = '%s' % self._raw[offset:offset + _fnodesrecsize] properprefix = node[0:4] # Validate and return existing entry. @@ -518,7 +512,7 @@ def _writeentry(self, offset, prefix, fnode): # Slices on array instances only accept other array. - entry = array('c', prefix + fnode) + entry = bytearray(prefix + fnode) self._raw[offset:offset + _fnodesrecsize] = entry # self._dirtyoffset could be None. 
self._dirtyoffset = min(self._dirtyoffset, offset) or 0 diff -r ed5b25874d99 -r 4baf79a77afa mercurial/templatefilters.py --- a/mercurial/templatefilters.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/templatefilters.py Fri Mar 24 08:37:26 2017 -0700 @@ -345,7 +345,7 @@ @templatefilter('stringescape') def stringescape(text): - return text.encode('string_escape') + return util.escapestr(text) @templatefilter('stringify') def stringify(thing): diff -r ed5b25874d99 -r 4baf79a77afa mercurial/templatekw.py --- a/mercurial/templatekw.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/templatekw.py Fri Mar 24 08:37:26 2017 -0700 @@ -204,6 +204,17 @@ return getrenamed +# default templates internally used for rendering of lists +defaulttempl = { + 'parent': '{rev}:{node|formatnode} ', + 'manifest': '{rev}:{node|formatnode}', + 'file_copy': '{name} ({source})', + 'envvar': '{key}={value}', + 'extra': '{key}={value|stringescape}' +} +# filecopy is preserved for compatibility reasons +defaulttempl['filecopy'] = defaulttempl['file_copy'] + # keywords are callables like: # fn(repo, ctx, templ, cache, revcache, **args) # with: @@ -325,7 +336,7 @@ c = [makemap(k) for k in extras] f = _showlist('extra', c, plural='extras', **args) return _hybrid(f, extras, makemap, - lambda x: '%s=%s' % (x['key'], x['value'])) + lambda x: '%s=%s' % (x['key'], util.escapestr(x['value']))) @templatekeyword('file_adds') def showfileadds(**args): diff -r ed5b25874d99 -r 4baf79a77afa mercurial/templater.py --- a/mercurial/templater.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/templater.py Fri Mar 24 08:37:26 2017 -0700 @@ -13,13 +13,16 @@ from .i18n import _ from . 
import ( + color, config, + encoding, error, minirst, parser, pycompat, registrar, revset as revsetmod, + revsetlang, templatefilters, templatekw, util, @@ -543,6 +546,19 @@ return templatefilters.fill(text, width, initindent, hangindent) +@templatefunc('formatnode(node)') +def formatnode(context, mapping, args): + """Obtain the preferred form of a changeset hash. (DEPRECATED)""" + if len(args) != 1: + # i18n: "formatnode" is a keyword + raise error.ParseError(_("formatnode expects one argument")) + + ui = mapping['ui'] + node = evalstring(context, mapping, args[0]) + if ui.debugflag: + return node + return templatefilters.short(node) + @templatefunc('pad(text, width[, fillchar=\' \'[, left=False]])') def pad(context, mapping, args): """Pad text with a @@ -561,13 +577,19 @@ fillchar = ' ' if len(args) > 2: fillchar = evalstring(context, mapping, args[2]) + if len(color.stripeffects(fillchar)) != 1: + # i18n: "pad" is a keyword + raise error.ParseError(_("pad() expects a single fill character")) if len(args) > 3: left = evalboolean(context, mapping, args[3]) + fillwidth = width - encoding.colwidth(color.stripeffects(text)) + if fillwidth <= 0: + return text if left: - return text.rjust(width, fillchar) + return fillchar * fillwidth + text else: - return text.ljust(width, fillchar) + return text + fillchar * fillwidth @templatefunc('indent(text, indentchars[, firstline])') def indent(context, mapping, args): @@ -778,7 +800,7 @@ if len(args) > 1: formatargs = [evalfuncarg(context, mapping, a) for a in args[1:]] - revs = query(revsetmod.formatspec(raw, *formatargs)) + revs = query(revsetlang.formatspec(raw, *formatargs)) revs = list(revs) else: revsetcache = mapping['cache'].setdefault("revsetcache", {}) diff -r ed5b25874d99 -r 4baf79a77afa mercurial/templates/gitweb/filelog.tmpl --- a/mercurial/templates/gitweb/filelog.tmpl Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/templates/gitweb/filelog.tmpl Fri Mar 24 08:37:26 2017 -0700 @@ -38,6 +38,8 @@ diff -r ed5b25874d99 
-r 4baf79a77afa mercurial/txnutil.py --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/mercurial/txnutil.py Fri Mar 24 08:37:26 2017 -0700 @@ -0,0 +1,36 @@ +# txnutil.py - transaction related utilities +# +# Copyright FUJIWARA Katsunori and others +# +# This software may be used and distributed according to the terms of the +# GNU General Public License version 2 or any later version. + +from __future__ import absolute_import + +import errno + +from . import ( + encoding, +) + +def mayhavepending(root): + '''return whether 'root' may have pending changes, which are + visible to this process. + ''' + return root == encoding.environ.get('HG_PENDING') + +def trypending(root, vfs, filename, **kwargs): + '''Open file to be read according to HG_PENDING environment variable + + This opens '.pending' of specified 'filename' only when HG_PENDING + is equal to 'root'. + + This returns '(fp, is_pending_opened)' tuple. + ''' + if mayhavepending(root): + try: + return (vfs('%s.pending' % filename, **kwargs), True) + except IOError as inst: + if inst.errno != errno.ENOENT: + raise + return (vfs(filename, **kwargs), False) diff -r ed5b25874d99 -r 4baf79a77afa mercurial/ui.py --- a/mercurial/ui.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/ui.py Fri Mar 24 08:37:26 2017 -0700 @@ -7,13 +7,17 @@ from __future__ import absolute_import +import atexit +import collections import contextlib import errno import getpass import inspect import os import re +import signal import socket +import subprocess import sys import tempfile import traceback @@ -22,6 +26,7 @@ from .node import hex from . 
import ( + color, config, encoding, error, @@ -34,6 +39,10 @@ urlreq = util.urlreq +# for use with str.translate(None, _keepalnum), to keep just alphanumerics +_keepalnum = ''.join(c for c in map(pycompat.bytechr, range(256)) + if not c.isalnum()) + samplehgrcs = { 'user': """# example user config (see 'hg help config' for more info) @@ -42,12 +51,14 @@ # username = Jane Doe username = +# uncomment to colorize command output +# color = auto + [extensions] # uncomment these lines to enable some popular extensions # (see 'hg help extensions' for more info) # -# pager = -# color =""", +# pager =""", 'cloned': """# example repository config (see 'hg help config' for more info) @@ -85,15 +96,38 @@ 'global': """# example system-wide hg config (see 'hg help config' for more info) +[ui] +# uncomment to colorize command output +# color = auto + [extensions] # uncomment these lines to enable some popular extensions # (see 'hg help extensions' for more info) # # blackbox = -# color = # pager =""", } + +class httppasswordmgrdbproxy(object): + """Delays loading urllib2 until it's needed.""" + def __init__(self): + self._mgr = None + + def _get_mgr(self): + if self._mgr is None: + self._mgr = urlreq.httppasswordmgrwithdefaultrealm() + return self._mgr + + def add_password(self, *args, **kwargs): + return self._get_mgr().add_password(*args, **kwargs) + + def find_user_password(self, *args, **kwargs): + return self._get_mgr().find_user_password(*args, **kwargs) + +def _catchterm(*args): + raise error.SignalInterrupt + class ui(object): def __init__(self, src=None): """Create a fresh new ui object if no src given @@ -120,11 +154,19 @@ self.callhooks = True # Insecure server connections requested. 
self.insecureconnections = False + # Blocked time + self.logblockedtimes = False + # color mode: see mercurial/color.py for possible value + self._colormode = None + self._terminfoparams = {} + self._styles = {} if src: self.fout = src.fout self.ferr = src.ferr self.fin = src.fin + self.pageractive = src.pageractive + self._disablepager = src._disablepager self._tcfg = src._tcfg.copy() self._ucfg = src._ucfg.copy() @@ -134,18 +176,26 @@ self.environ = src.environ self.callhooks = src.callhooks self.insecureconnections = src.insecureconnections + self._colormode = src._colormode + self._terminfoparams = src._terminfoparams.copy() + self._styles = src._styles.copy() + self.fixconfig() self.httppasswordmgrdb = src.httppasswordmgrdb + self._blockedtimes = src._blockedtimes else: self.fout = util.stdout self.ferr = util.stderr self.fin = util.stdin + self.pageractive = False + self._disablepager = False # shared read-only environment self.environ = encoding.environ - self.httppasswordmgrdb = urlreq.httppasswordmgrwithdefaultrealm() + self.httppasswordmgrdb = httppasswordmgrdbproxy() + self._blockedtimes = collections.defaultdict(int) allowed = self.configlist('experimental', 'exportableenviron') if '*' in allowed: @@ -172,7 +222,17 @@ """Clear internal state that shouldn't persist across commands""" if self._progbar: self._progbar.resetstate() # reset last-print time of progress bar - self.httppasswordmgrdb = urlreq.httppasswordmgrwithdefaultrealm() + self.httppasswordmgrdb = httppasswordmgrdbproxy() + + @contextlib.contextmanager + def timeblockedsection(self, key): + # this is open-coded below - search for timeblockedsection to find them + starttime = util.timer() + try: + yield + finally: + self._blockedtimes[key + '_blocked'] += \ + (util.timer() - starttime) * 1000 def formatter(self, topic, opts): return formatter.formatter(self, topic, opts) @@ -224,6 +284,8 @@ del cfg['ui'][k] for k, v in cfg.items('defaults'): del cfg['defaults'][k] + for k, v in 
cfg.items('commands'): + del cfg['commands'][k] # Don't remove aliases from the configuration if in the exceptionlist if self.plain('alias'): for k, v in cfg.items('alias'): @@ -277,6 +339,7 @@ self._reportuntrusted = self.debugflag or self.configbool("ui", "report_untrusted", True) self.tracebackflag = self.configbool('ui', 'traceback', False) + self.logblockedtimes = self.configbool('ui', 'logblockedtimes') if section in (None, 'trusted'): # update trust information @@ -402,6 +465,41 @@ % (section, name, v)) return b + def configwith(self, convert, section, name, default=None, + desc=None, untrusted=False): + """parse a configuration element with a conversion function + + >>> u = ui(); s = 'foo' + >>> u.setconfig(s, 'float1', '42') + >>> u.configwith(float, s, 'float1') + 42.0 + >>> u.setconfig(s, 'float2', '-4.25') + >>> u.configwith(float, s, 'float2') + -4.25 + >>> u.configwith(float, s, 'unknown', 7) + 7 + >>> u.setconfig(s, 'invalid', 'somevalue') + >>> u.configwith(float, s, 'invalid') + Traceback (most recent call last): + ... + ConfigError: foo.invalid is not a valid float ('somevalue') + >>> u.configwith(float, s, 'invalid', desc='womble') + Traceback (most recent call last): + ... + ConfigError: foo.invalid is not a valid womble ('somevalue') + """ + + v = self.config(section, name, None, untrusted) + if v is None: + return default + try: + return convert(v) + except ValueError: + if desc is None: + desc = convert.__name__ + raise error.ConfigError(_("%s.%s is not a valid %s ('%s')") + % (section, name, desc, v)) + def configint(self, section, name, default=None, untrusted=False): """parse a configuration element as an integer @@ -418,17 +516,11 @@ >>> u.configint(s, 'invalid') Traceback (most recent call last): ... 
- ConfigError: foo.invalid is not an integer ('somevalue') + ConfigError: foo.invalid is not a valid integer ('somevalue') """ - v = self.config(section, name, None, untrusted) - if v is None: - return default - try: - return int(v) - except ValueError: - raise error.ConfigError(_("%s.%s is not an integer ('%s')") - % (section, name, v)) + return self.configwith(int, section, name, default, 'integer', + untrusted) def configbytes(self, section, name, default=0, untrusted=False): """parse a configuration element as a quantity in bytes @@ -452,7 +544,7 @@ ConfigError: foo.invalid is not a byte quantity ('somevalue') """ - value = self.config(section, name) + value = self.config(section, name, None, untrusted) if value is None: if not isinstance(default, str): return default @@ -472,84 +564,11 @@ >>> u.configlist(s, 'list1') ['this', 'is', 'a small', 'test'] """ - - def _parse_plain(parts, s, offset): - whitespace = False - while offset < len(s) and (s[offset].isspace() or s[offset] == ','): - whitespace = True - offset += 1 - if offset >= len(s): - return None, parts, offset - if whitespace: - parts.append('') - if s[offset] == '"' and not parts[-1]: - return _parse_quote, parts, offset + 1 - elif s[offset] == '"' and parts[-1][-1] == '\\': - parts[-1] = parts[-1][:-1] + s[offset] - return _parse_plain, parts, offset + 1 - parts[-1] += s[offset] - return _parse_plain, parts, offset + 1 - - def _parse_quote(parts, s, offset): - if offset < len(s) and s[offset] == '"': # "" - parts.append('') - offset += 1 - while offset < len(s) and (s[offset].isspace() or - s[offset] == ','): - offset += 1 - return _parse_plain, parts, offset - - while offset < len(s) and s[offset] != '"': - if (s[offset] == '\\' and offset + 1 < len(s) - and s[offset + 1] == '"'): - offset += 1 - parts[-1] += '"' - else: - parts[-1] += s[offset] - offset += 1 - - if offset >= len(s): - real_parts = _configlist(parts[-1]) - if not real_parts: - parts[-1] = '"' - else: - real_parts[0] = '"' + 
real_parts[0] - parts = parts[:-1] - parts.extend(real_parts) - return None, parts, offset - - offset += 1 - while offset < len(s) and s[offset] in [' ', ',']: - offset += 1 - - if offset < len(s): - if offset + 1 == len(s) and s[offset] == '"': - parts[-1] += '"' - offset += 1 - else: - parts.append('') - else: - return None, parts, offset - - return _parse_plain, parts, offset - - def _configlist(s): - s = s.rstrip(' ,') - if not s: - return [] - parser, parts, offset = _parse_plain, [''], 0 - while parser: - parser, parts, offset = parser(parts, s, offset) - return parts - - result = self.config(section, name, untrusted=untrusted) - if result is None: - result = default or [] - if isinstance(result, bytes): - result = _configlist(result.lstrip(' ,\n')) - if result is None: - result = default or [] - return result + # default is not always a list + if isinstance(default, bytes): + default = config.parselist(default) + return self.configwith(config.parselist, section, name, default or [], + 'list', untrusted) def hasconfig(self, section, name, untrusted=False): return self._data(untrusted).hasitem(section, name) @@ -696,55 +715,202 @@ def write(self, *args, **opts): '''write args to output - By default, this method simply writes to the buffer or stdout, - but extensions or GUI tools may override this method, - write_err(), popbuffer(), and label() to style output from - various parts of hg. + By default, this method simply writes to the buffer or stdout. + Color mode can be set on the UI class to have the output decorated + with color modifier before being written to stdout. - An optional keyword argument, "label", can be passed in. - This should be a string containing label names separated by - space. Label names take the form of "topic.type". For example, - ui.debug() issues a label of "ui.debug". + The color used is controlled by an optional keyword argument, "label". + This should be a string containing label names separated by space. 
+ Label names take the form of "topic.type". For example, ui.debug() + issues a label of "ui.debug". When labeling output for a specific command, a label of "cmdname.type" is recommended. For example, status issues a label of "status.modified" for modified files. ''' if self._buffers and not opts.get('prompt', False): - self._buffers[-1].extend(a for a in args) + if self._bufferapplylabels: + label = opts.get('label', '') + self._buffers[-1].extend(self.label(a, label) for a in args) + else: + self._buffers[-1].extend(args) + elif self._colormode == 'win32': + # windows color printing is its own can of crab, defer to + # the color module and that is it. + color.win32print(self, self._write, *args, **opts) else: - self._progclear() - for a in args: + msgs = args + if self._colormode is not None: + label = opts.get('label', '') + msgs = [self.label(a, label) for a in args] + self._write(*msgs, **opts) + + def _write(self, *msgs, **opts): + self._progclear() + # opencode timeblockedsection because this is a critical path + starttime = util.timer() + try: + for a in msgs: self.fout.write(a) + finally: + self._blockedtimes['stdio_blocked'] += \ + (util.timer() - starttime) * 1000 def write_err(self, *args, **opts): self._progclear() + if self._bufferstates and self._bufferstates[-1][0]: + self.write(*args, **opts) + elif self._colormode == 'win32': + # windows color printing is its own can of crab, defer to + # the color module and that is it. 
+ color.win32print(self, self._write_err, *args, **opts) + else: + msgs = args + if self._colormode is not None: + label = opts.get('label', '') + msgs = [self.label(a, label) for a in args] + self._write_err(*msgs, **opts) + + def _write_err(self, *msgs, **opts): try: - if self._bufferstates and self._bufferstates[-1][0]: - return self.write(*args, **opts) - if not getattr(self.fout, 'closed', False): - self.fout.flush() - for a in args: - self.ferr.write(a) - # stderr may be buffered under win32 when redirected to files, - # including stdout. - if not getattr(self.ferr, 'closed', False): - self.ferr.flush() + with self.timeblockedsection('stdio'): + if not getattr(self.fout, 'closed', False): + self.fout.flush() + for a in msgs: + self.ferr.write(a) + # stderr may be buffered under win32 when redirected to files, + # including stdout. + if not getattr(self.ferr, 'closed', False): + self.ferr.flush() except IOError as inst: if inst.errno not in (errno.EPIPE, errno.EIO, errno.EBADF): raise def flush(self): - try: self.fout.flush() - except (IOError, ValueError): pass - try: self.ferr.flush() - except (IOError, ValueError): pass + # opencode timeblockedsection because this is a critical path + starttime = util.timer() + try: + try: self.fout.flush() + except (IOError, ValueError): pass + try: self.ferr.flush() + except (IOError, ValueError): pass + finally: + self._blockedtimes['stdio_blocked'] += \ + (util.timer() - starttime) * 1000 def _isatty(self, fh): if self.configbool('ui', 'nontty', False): return False return util.isatty(fh) + def disablepager(self): + self._disablepager = True + + def pager(self, command): + """Start a pager for subsequent command output. + + Commands which produce a long stream of output should call + this function to activate the user's preferred pagination + mechanism (which may be no pager). Calling this function + precludes any future use of interactive functionality, such as + prompting the user or activating curses. 
+ + Args: + command: The full, non-aliased name of the command. That is, "log" + not "history", "summary" not "summ", etc. + """ + if (self._disablepager + or self.pageractive + or command in self.configlist('pager', 'ignore') + or not self.configbool('pager', 'enable', True) + or not self.configbool('pager', 'attend-' + command, True) + # TODO: if we want to allow HGPLAINEXCEPT=pager, + # formatted() will need some adjustment. + or not self.formatted() + or self.plain() + # TODO: expose debugger-enabled on the UI object + or '--debugger' in pycompat.sysargv): + # We only want to paginate if the ui appears to be + # interactive, the user didn't say HGPLAIN or + # HGPLAINEXCEPT=pager, and the user didn't specify --debugger. + return + + # TODO: add a "system defaults" config section so this default + # of more(1) can be easily replaced with a global + # configuration file. For example, on OS X the sane default is + # less(1), not more(1), and on debian it's + # sensible-pager(1). We should probably also give the system + # default editor command similar treatment. + envpager = encoding.environ.get('PAGER', 'more') + pagercmd = self.config('pager', 'pager', envpager) + if not pagercmd: + return + + if pycompat.osname == 'nt': + # `more` cannot be invoked with shell=False, but `more.com` can. + # Hide this implementation detail from the user, so we can also get + # sane bad PAGER behavior. If args are also given, the space in the + # command line forces shell=True, so that case doesn't need to be + # handled here. + if pagercmd == 'more': + pagercmd = 'more.com' + + self.debug('starting pager for command %r\n' % command) + self.flush() + self.pageractive = True + # Preserve the formatted-ness of the UI. This is important + # because we mess with stdout, which might confuse + # auto-detection of things being formatted.
+ self.setconfig('ui', 'formatted', self.formatted(), 'pager') + self.setconfig('ui', 'interactive', False, 'pager') + if util.safehasattr(signal, "SIGPIPE"): + signal.signal(signal.SIGPIPE, _catchterm) + self._runpager(pagercmd) + + def _runpager(self, command): + """Actually start the pager and set up file descriptors. + + This is separate in part so that extensions (like chg) can + override how a pager is invoked. + """ + if command == 'cat': + # Save ourselves some work. + return + # If the command doesn't contain any of these characters, we + # assume it's a binary and exec it directly. This means for + # simple pager command configurations, we can degrade + # gracefully and tell the user about their broken pager. + shell = any(c in command for c in "|&;<>()$`\\\"' \t\n*?[#~=%") + try: + pager = subprocess.Popen( + command, shell=shell, bufsize=-1, + close_fds=util.closefds, stdin=subprocess.PIPE, + stdout=util.stdout, stderr=util.stderr) + except OSError as e: + if e.errno == errno.ENOENT and not shell: + self.warn(_("missing pager command '%s', skipping pager\n") + % command) + return + raise + + # back up original file descriptors + stdoutfd = os.dup(util.stdout.fileno()) + stderrfd = os.dup(util.stderr.fileno()) + + os.dup2(pager.stdin.fileno(), util.stdout.fileno()) + if self._isatty(util.stderr): + os.dup2(pager.stdin.fileno(), util.stderr.fileno()) + + @atexit.register + def killpager(): + if util.safehasattr(signal, "SIGINT"): + signal.signal(signal.SIGINT, signal.SIG_IGN) + # restore original fds, closing pager.stdin copies in the process + os.dup2(stdoutfd, util.stdout.fileno()) + os.dup2(stderrfd, util.stderr.fileno()) + pager.stdin.close() + pager.wait() + def interface(self, feature): """what interface to use for interactive console features? 
@@ -900,7 +1066,8 @@ sys.stdout = self.fout # prompt ' ' must exist; otherwise readline may delete entire line # - http://bugs.python.org/issue12833 - line = raw_input(' ') + with self.timeblockedsection('stdio'): + line = raw_input(' ') sys.stdin = oldin sys.stdout = oldout @@ -980,13 +1147,14 @@ self.write_err(self.label(prompt or _('password: '), 'ui.prompt')) # disable getpass() only if explicitly specified. it's still valid # to interact with tty even if fin is not a tty. - if self.configbool('ui', 'nontty'): - l = self.fin.readline() - if not l: - raise EOFError - return l.rstrip('\n') - else: - return getpass.getpass('') + with self.timeblockedsection('stdio'): + if self.configbool('ui', 'nontty'): + l = self.fin.readline() + if not l: + raise EOFError + return l.rstrip('\n') + else: + return getpass.getpass('') except EOFError: raise error.ResponseExpected() def status(self, *msg, **opts): @@ -995,14 +1163,14 @@ This adds an output label of "ui.status". ''' if not self.quiet: - opts['label'] = opts.get('label', '') + ' ui.status' + opts[r'label'] = opts.get(r'label', '') + ' ui.status' self.write(*msg, **opts) def warn(self, *msg, **opts): '''write warning message to output (stderr) This adds an output label of "ui.warning". ''' - opts['label'] = opts.get('label', '') + ' ui.warning' + opts[r'label'] = opts.get(r'label', '') + ' ui.warning' self.write_err(*msg, **opts) def note(self, *msg, **opts): '''write note to output (if ui.verbose is True) @@ -1010,7 +1178,7 @@ This adds an output label of "ui.note". ''' if self.verbose: - opts['label'] = opts.get('label', '') + ' ui.note' + opts[r'label'] = opts.get(r'label', '') + ' ui.note' self.write(*msg, **opts) def debug(self, *msg, **opts): '''write debug message to output (if ui.debugflag is True) @@ -1018,7 +1186,7 @@ This adds an output label of "ui.debug". 
''' if self.debugflag: - opts['label'] = opts.get('label', '') + ' ui.debug' + opts[r'label'] = opts.get(r'label', '') + ' ui.debug' self.write(*msg, **opts) def edit(self, text, user, extra=None, editform=None, pending=None, @@ -1038,8 +1206,8 @@ suffix=extra['suffix'], text=True, dir=rdir) try: - f = os.fdopen(fd, "w") - f.write(text) + f = os.fdopen(fd, pycompat.sysstr("w")) + f.write(encoding.strfromlocal(text)) f.close() environ = {'HGUSER': user} @@ -1058,25 +1226,47 @@ self.system("%s \"%s\"" % (editor, name), environ=environ, - onerr=error.Abort, errprefix=_("edit failed")) + onerr=error.Abort, errprefix=_("edit failed"), + blockedtag='editor') f = open(name) - t = f.read() + t = encoding.strtolocal(f.read()) f.close() finally: os.unlink(name) return t - def system(self, cmd, environ=None, cwd=None, onerr=None, errprefix=None): + def system(self, cmd, environ=None, cwd=None, onerr=None, errprefix=None, + blockedtag=None): '''execute shell command with appropriate output stream. command output will be redirected if fout is not stdout. + + if command fails and onerr is None, return status, else raise onerr + object as exception. ''' + if blockedtag is None: + # Long cmds tend to be because of an absolute path on cmd. 
Keep + # the tail end instead + cmdsuffix = cmd.translate(None, _keepalnum)[-85:] + blockedtag = 'unknown_system_' + cmdsuffix out = self.fout if any(s[1] for s in self._bufferstates): out = self - return util.system(cmd, environ=environ, cwd=cwd, onerr=onerr, - errprefix=errprefix, out=out) + with self.timeblockedsection(blockedtag): + rc = self._runsystem(cmd, environ=environ, cwd=cwd, out=out) + if rc and onerr: + errmsg = '%s %s' % (os.path.basename(cmd.split(None, 1)[0]), + util.explainexit(rc)[0]) + if errprefix: + errmsg = '%s: %s' % (errprefix, errmsg) + raise onerr(errmsg) + return rc + + def _runsystem(self, cmd, environ, cwd, out): + """actually execute the given shell command (can be overridden by + extensions like chg)""" + return util.system(cmd, environ=environ, cwd=cwd, out=out) def traceback(self, exc=None, force=False): '''print exception traceback if traceback printing enabled or forced. @@ -1099,7 +1289,11 @@ ''.join(exconly)) else: output = traceback.format_exception(exc[0], exc[1], exc[2]) - self.write_err(''.join(output)) + data = r''.join(output) + if pycompat.ispy3: + enc = pycompat.sysstr(encoding.encoding) + data = data.encode(enc, errors=r'replace') + self.write_err(data) return self.tracebackflag or force def geteditor(self): @@ -1180,13 +1374,15 @@ def label(self, msg, label): '''style msg based on supplied label - Like ui.write(), this just returns msg unchanged, but extensions - and GUI tools can override it to allow styling output without - writing it. + If some color mode is enabled, this will add the necessary control + characters to apply such color. In addition, 'debug' color mode adds + markup showing which label affects a piece of text. ui.write(s, 'label') is equivalent to ui.write(ui.label(s, 'label')). 
''' + if self._colormode is not None: + return color.colorlabel(self, msg, label) return msg def develwarn(self, msg, stacklevel=1, config=None): @@ -1377,7 +1573,7 @@ self.name = name self.rawloc = rawloc - self.loc = str(u) + self.loc = '%s' % u # When given a raw location but not a symbolic name, validate the # location is valid. diff -r ed5b25874d99 -r 4baf79a77afa mercurial/unionrepo.py --- a/mercurial/unionrepo.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/unionrepo.py Fri Mar 24 08:37:26 2017 -0700 @@ -27,8 +27,8 @@ pathutil, pycompat, revlog, - scmutil, util, + vfs as vfsmod, ) class unionrevlog(revlog.revlog): @@ -39,7 +39,7 @@ # # To differentiate a rev in the second revlog from a rev in the revlog, # we check revision against repotiprev. - opener = scmutil.readonlyvfs(opener) + opener = vfsmod.readonlyvfs(opener) revlog.revlog.__init__(self, opener, indexfile) self.revlog2 = revlog2 diff -r ed5b25874d99 -r 4baf79a77afa mercurial/util.py --- a/mercurial/util.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/util.py Fri Mar 24 08:37:26 2017 -0700 @@ -17,6 +17,7 @@ import bz2 import calendar +import codecs import collections import datetime import errno @@ -59,13 +60,24 @@ stdout = pycompat.stdout stringio = pycompat.stringio urlerr = pycompat.urlerr -urlparse = pycompat.urlparse urlreq = pycompat.urlreq xmlrpclib = pycompat.xmlrpclib +def isatty(fp): + try: + return fp.isatty() + except AttributeError: + return False + +# glibc determines buffering on first write to stdout - if we replace a TTY +# destined stdout with a pipe destined stdout (e.g. pager), we want line +# buffering +if isatty(stdout): + stdout = os.fdopen(stdout.fileno(), pycompat.sysstr('wb'), 1) + if pycompat.osname == 'nt': from . import windows as platform - stdout = platform.winstdout(pycompat.stdout) + stdout = platform.winstdout(stdout) else: from . 
import posix as platform @@ -123,7 +135,6 @@ testpid = platform.testpid umask = platform.umask unlink = platform.unlink -unlinkpath = platform.unlinkpath username = platform.username # Python compatibility @@ -797,7 +808,7 @@ inname, outname = None, None try: infd, inname = tempfile.mkstemp(prefix='hg-filter-in-') - fp = os.fdopen(infd, 'wb') + fp = os.fdopen(infd, pycompat.sysstr('wb')) fp.write(s) fp.close() outfd, outname = tempfile.mkstemp(prefix='hg-filter-out-') @@ -943,10 +954,7 @@ # executable version (py2exe) doesn't support __file__ datapath = os.path.dirname(pycompat.sysexecutable) else: - datapath = os.path.dirname(__file__) - -if not isinstance(datapath, bytes): - datapath = pycompat.fsencode(datapath) + datapath = os.path.dirname(pycompat.fsencode(__file__)) i18n.setdatapath(datapath) @@ -959,7 +967,7 @@ """ if _hgexecutable is None: hg = encoding.environ.get('HG') - mainmod = sys.modules['__main__'] + mainmod = sys.modules[pycompat.sysstr('__main__')] if hg: _sethgexecutable(hg) elif mainfrozen(): @@ -968,8 +976,9 @@ _sethgexecutable(encoding.environ['EXECUTABLEPATH']) else: _sethgexecutable(pycompat.sysexecutable) - elif os.path.basename(getattr(mainmod, '__file__', '')) == 'hg': - _sethgexecutable(mainmod.__file__) + elif (os.path.basename( + pycompat.fsencode(getattr(mainmod, '__file__', ''))) == 'hg'): + _sethgexecutable(pycompat.fsencode(mainmod.__file__)) else: exe = findexe('hg') or os.path.basename(sys.argv[0]) _sethgexecutable(exe) @@ -999,20 +1008,16 @@ env['HG'] = hgexecutable() return env -def system(cmd, environ=None, cwd=None, onerr=None, errprefix=None, out=None): +def system(cmd, environ=None, cwd=None, out=None): '''enhanced shell command execution. run with environment maybe modified, maybe in different dir. - if command fails and onerr is None, return status, else raise onerr - object as exception. - if out is specified, it is assumed to be a file-like object that has a write() method. 
stdout and stderr will be redirected to out.''' try: stdout.flush() except Exception: pass - origcmd = cmd cmd = quotecommand(cmd) if pycompat.sysplatform == 'plan9' and (sys.version_info[0] == 2 and sys.version_info[1] < 7): @@ -1036,12 +1041,6 @@ rc = proc.returncode if pycompat.sysplatform == 'OpenVMS' and rc & 1: rc = 0 - if rc and onerr: - errmsg = '%s %s' % (os.path.basename(origcmd.split(None, 1)[0]), - explainexit(rc)[0]) - if errprefix: - errmsg = '%s: %s' % (errprefix, errmsg) - raise onerr(errmsg) return rc def checksignature(func): @@ -1056,6 +1055,21 @@ return check +# a whitelist of known filesystems where hardlink works reliably +_hardlinkfswhitelist = set([ + 'btrfs', + 'ext2', + 'ext3', + 'ext4', + 'hfs', + 'jfs', + 'reiserfs', + 'tmpfs', + 'ufs', + 'xfs', + 'zfs', +]) + def copyfile(src, dest, hardlink=False, copystat=False, checkambig=False): '''copy a file, preserving mode and optionally other stat info like atime/mtime @@ -1072,9 +1086,13 @@ if checkambig: oldstat = checkambig and filestat(dest) unlink(dest) - # hardlinks are problematic on CIFS, quietly ignore this flag - # until we find a way to work around it cleanly (issue4546) - if False and hardlink: + if hardlink: + # Hardlinks are problematic on CIFS (issue4546), do not allow hardlinks + # unless we are confident that dest is on a whitelisted filesystem.
+ fstype = getfstype(os.path.dirname(dest)) + if fstype not in _hardlinkfswhitelist: + hardlink = False + if hardlink: try: oslink(src, dest) return @@ -1173,7 +1191,7 @@ for n in path.replace('\\', '/').split('/'): if not n: continue - for c in n: + for c in pycompat.bytestr(n): if c in _winreservedchars: return _("filename contains '%s', which is reserved " "on Windows") % c @@ -1191,8 +1209,13 @@ if pycompat.osname == 'nt': checkosfilename = checkwinfilename + timer = time.clock else: checkosfilename = platform.checkosfilename + timer = time.time + +if safehasattr(time, "perf_counter"): + timer = time.perf_counter def makelock(info, pathname): try: @@ -1322,7 +1345,7 @@ seps = seps + pycompat.osaltsep # Protect backslashes. This gets silly very quickly. seps.replace('\\','\\\\') - pattern = remod.compile(r'([^%s]+)|([%s]+)' % (seps, seps)) + pattern = remod.compile(br'([^%s]+)|([%s]+)' % (seps, seps)) dir = os.path.normpath(root) result = [] for part, sep in pattern.findall(name): @@ -1346,6 +1369,13 @@ return ''.join(result) +def getfstype(dirpath): + '''Get the filesystem type name from a directory (best-effort) + + Returns None if we are unsure, or errors like ENOENT, EPERM happen. 
+ ''' + return getattr(osutil, 'getfstype', lambda x: None)(dirpath) + def checknlink(testfile): '''check whether hardlink count reporting works properly''' @@ -1596,6 +1626,26 @@ else: self.close() +def unlinkpath(f, ignoremissing=False): + """unlink and remove the directory if it is empty""" + if ignoremissing: + tryunlink(f) + else: + unlink(f) + # try removing directories that might now be empty + try: + removedirs(os.path.dirname(f)) + except OSError: + pass + +def tryunlink(f): + """Attempt to remove a file, ignoring ENOENT errors.""" + try: + unlink(f) + except OSError as e: + if e.errno != errno.ENOENT: + raise + def makedirs(name, mode=None, notindexed=False): """recursive directory creation with parent mode inheritance @@ -1784,7 +1834,7 @@ # because they use the gmtime() system call which is buggy on Windows # for negative values. t = datetime.datetime(1970, 1, 1) + datetime.timedelta(seconds=d) - s = t.strftime(format) + s = encoding.strtolocal(t.strftime(encoding.strfromlocal(format))) return s def shortdate(date=None): @@ -1819,9 +1869,12 @@ return None, s -def strdate(string, format, defaults=[]): +def strdate(string, format, defaults=None): """parse a localized time string and return a (unixtime, offset) tuple. 
if the string cannot be parsed, ValueError is raised.""" + if defaults is None: + defaults = {} + # NOTE: unixtime = localunixtime + offset offset, date = parsetimezone(string) @@ -2120,6 +2173,14 @@ (1, 1, _('%.0f bytes')), ) +def escapestr(s): + # call underlying function of s.encode('string_escape') directly for + # Python 3 compatibility + return codecs.escape_encode(s)[0] + +def unescapestr(s): + return codecs.escape_decode(s)[0] + def uirepr(s): # Avoid double backslash in Windows path repr() return repr(s).replace('\\\\', '\\') @@ -2233,13 +2294,16 @@ if width <= maxindent: # adjust for weird terminal size width = max(78, maxindent + 1) - line = line.decode(encoding.encoding, encoding.encodingmode) - initindent = initindent.decode(encoding.encoding, encoding.encodingmode) - hangindent = hangindent.decode(encoding.encoding, encoding.encodingmode) + line = line.decode(pycompat.sysstr(encoding.encoding), + pycompat.sysstr(encoding.encodingmode)) + initindent = initindent.decode(pycompat.sysstr(encoding.encoding), + pycompat.sysstr(encoding.encodingmode)) + hangindent = hangindent.decode(pycompat.sysstr(encoding.encoding), + pycompat.sysstr(encoding.encodingmode)) wrapper = MBTextWrapper(width=width, initial_indent=initindent, subsequent_indent=hangindent) - return wrapper.fill(line).encode(encoding.encoding) + return wrapper.fill(line).encode(pycompat.sysstr(encoding.encoding)) if (pyplatform.python_implementation() == 'CPython' and sys.version_info < (3, 0)): @@ -2595,7 +2659,7 @@ 'path', 'fragment'): v = getattr(self, a) if v is not None: - setattr(self, a, pycompat.urlunquote(v)) + setattr(self, a, urlreq.unquote(v)) def __repr__(self): attrs = [] @@ -2640,6 +2704,9 @@ >>> print url(r'file:///D:\data\hg') file:///D:\data\hg """ + return encoding.strfromlocal(self.__bytes__()) + + def __bytes__(self): if self._localpath: s = self.path if self.scheme == 'bundle': @@ -2750,12 +2817,6 @@ u.user = u.passwd = None return str(u) -def isatty(fp): - try: - return 
fp.isatty() - except AttributeError: - return False - timecount = unitcountfn( (1, 1e3, _('%.0f s')), (100, 1, _('%.1f s')), @@ -2786,13 +2847,13 @@ ''' def wrapper(*args, **kwargs): - start = time.time() + start = timer() indent = 2 _timenesting[0] += indent try: return func(*args, **kwargs) finally: - elapsed = time.time() - start + elapsed = timer() - start _timenesting[0] -= indent stderr.write('%s%s: %s\n' % (' ' * _timenesting[0], func.__name__, @@ -2839,9 +2900,9 @@ results.append(hook(*args)) return results -def getstackframes(skip=0, line=' %-*s in %s\n', fileline='%s:%s'): +def getstackframes(skip=0, line=' %-*s in %s\n', fileline='%s:%s', depth=0): '''Yields lines for a nicely formatted stacktrace. - Skips the 'skip' last entries. + Skips the 'skip' last entries, then return the last 'depth' entries. Each file+linenumber is formatted according to fileline. Each line is formatted according to line. If line is None, it yields: @@ -2852,7 +2913,8 @@ Not be used in production code but very convenient while developing. ''' entries = [(fileline % (fn, ln), func) - for fn, ln, func, _text in traceback.extract_stack()[:-skip - 1]] + for fn, ln, func, _text in traceback.extract_stack()[:-skip - 1] + ][-depth:] if entries: fnmax = max(len(entry[0]) for entry in entries) for fnln, func in entries: @@ -2861,16 +2923,18 @@ else: yield line % (fnmax, fnln, func) -def debugstacktrace(msg='stacktrace', skip=0, f=stderr, otherf=stdout): +def debugstacktrace(msg='stacktrace', skip=0, + f=stderr, otherf=stdout, depth=0): '''Writes a message to f (stderr) with a nicely formatted stacktrace. - Skips the 'skip' last entries. By default it will flush stdout first. + Skips the 'skip' entries closest to the call, then show 'depth' entries. + By default it will flush stdout first. It can be used everywhere and intentionally does not require an ui object. Not be used in production code but very convenient while developing. 
''' if otherf: otherf.flush() - f.write('%s at:\n' % msg) - for line in getstackframes(skip + 1): + f.write('%s at:\n' % msg.rstrip()) + for line in getstackframes(skip + 1, depth=depth): f.write(line) f.flush() @@ -2905,7 +2969,7 @@ del dirs[base] def __iter__(self): - return self._dirs.iterkeys() + return iter(self._dirs) def __contains__(self, d): return d in self._dirs diff -r ed5b25874d99 -r 4baf79a77afa mercurial/verify.py --- a/mercurial/verify.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/verify.py Fri Mar 24 08:37:26 2017 -0700 @@ -18,6 +18,7 @@ from . import ( error, revlog, + scmutil, util, ) @@ -32,21 +33,13 @@ f = f.replace('//', '/') return f -def _validpath(repo, path): - """Returns False if a path should NOT be treated as part of a repo. - - For all in-core cases, this returns True, as we have no way for a - path to be mentioned in the history but not actually be - relevant. For narrow clones, this is important because many - filelogs will be missing, and changelog entries may mention - modified files that are outside the narrow scope. - """ - return True - class verifier(object): - def __init__(self, repo): + # The match argument is always None in hg core, but e.g. the narrowhg + # extension will pass in a matcher here. 
+ def __init__(self, repo, match=None): self.repo = repo.unfiltered() self.ui = repo.ui + self.match = match or scmutil.matchall(repo) self.badrevs = set() self.errors = 0 self.warnings = 0 @@ -170,6 +163,7 @@ def _verifychangelog(self): ui = self.ui repo = self.repo + match = self.match cl = repo.changelog ui.status(_("checking changesets\n")) @@ -189,7 +183,7 @@ mflinkrevs.setdefault(changes[0], []).append(i) self.refersmf = True for f in changes[3]: - if _validpath(repo, f): + if match(f): filelinkrevs.setdefault(_normpath(f), []).append(i) except Exception as inst: self.refersmf = True @@ -201,6 +195,7 @@ progress=None): repo = self.repo ui = self.ui + match = self.match mfl = self.repo.manifestlog mf = mfl._revlog.dirlog(dir) @@ -243,12 +238,14 @@ elif f == "/dev/null": # ignore this in very old repos continue fullpath = dir + _normpath(f) - if not _validpath(repo, fullpath): - continue if fl == 't': + if not match.visitdir(fullpath): + continue subdirnodes.setdefault(fullpath + '/', {}).setdefault( fn, []).append(lr) else: + if not match(fullpath): + continue filenodes.setdefault(fullpath, {}).setdefault(fn, lr) except Exception as inst: self.exc(lr, _("reading delta %s") % short(n), inst, label) diff -r ed5b25874d99 -r 4baf79a77afa mercurial/vfs.py --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/mercurial/vfs.py Fri Mar 24 08:37:26 2017 -0700 @@ -0,0 +1,637 @@ +# vfs.py - Mercurial 'vfs' classes +# +# Copyright Matt Mackall +# +# This software may be used and distributed according to the terms of the +# GNU General Public License version 2 or any later version. +from __future__ import absolute_import + +import contextlib +import errno +import os +import shutil +import stat +import tempfile +import threading + +from .i18n import _ +from . 
import ( + error, + osutil, + pathutil, + pycompat, + util, +) + +class abstractvfs(object): + """Abstract base class; cannot be instantiated""" + + def __init__(self, *args, **kwargs): + '''Prevent instantiation; don't call this from subclasses.''' + raise NotImplementedError('attempted instantiating ' + str(type(self))) + + def tryread(self, path): + '''gracefully return an empty string for missing files''' + try: + return self.read(path) + except IOError as inst: + if inst.errno != errno.ENOENT: + raise + return "" + + def tryreadlines(self, path, mode='rb'): + '''gracefully return an empty array for missing files''' + try: + return self.readlines(path, mode=mode) + except IOError as inst: + if inst.errno != errno.ENOENT: + raise + return [] + + @util.propertycache + def open(self): + '''Open ``path`` file, which is relative to vfs root. + + Newly created directories are marked as "not to be indexed by + the content indexing service", if ``notindexed`` is specified + for "write" mode access. 
+ ''' + return self.__call__ + + def read(self, path): + with self(path, 'rb') as fp: + return fp.read() + + def readlines(self, path, mode='rb'): + with self(path, mode=mode) as fp: + return fp.readlines() + + def write(self, path, data, backgroundclose=False): + with self(path, 'wb', backgroundclose=backgroundclose) as fp: + return fp.write(data) + + def writelines(self, path, data, mode='wb', notindexed=False): + with self(path, mode=mode, notindexed=notindexed) as fp: + return fp.writelines(data) + + def append(self, path, data): + with self(path, 'ab') as fp: + return fp.write(data) + + def basename(self, path): + """return base element of a path (as os.path.basename would do) + + This exists to allow handling of strange encoding if needed.""" + return os.path.basename(path) + + def chmod(self, path, mode): + return os.chmod(self.join(path), mode) + + def dirname(self, path): + """return dirname element of a path (as os.path.dirname would do) + + This exists to allow handling of strange encoding if needed.""" + return os.path.dirname(path) + + def exists(self, path=None): + return os.path.exists(self.join(path)) + + def fstat(self, fp): + return util.fstat(fp) + + def isdir(self, path=None): + return os.path.isdir(self.join(path)) + + def isfile(self, path=None): + return os.path.isfile(self.join(path)) + + def islink(self, path=None): + return os.path.islink(self.join(path)) + + def isfileorlink(self, path=None): + '''return whether path is a regular file or a symlink + + Unlike isfile, this doesn't follow symlinks.''' + try: + st = self.lstat(path) + except OSError: + return False + mode = st.st_mode + return stat.S_ISREG(mode) or stat.S_ISLNK(mode) + + def reljoin(self, *paths): + """join various elements of a path together (as os.path.join would do) + + The vfs base is not injected so that path stay relative. 
This exists + to allow handling of strange encoding if needed.""" + return os.path.join(*paths) + + def split(self, path): + """split top-most element of a path (as os.path.split would do) + + This exists to allow handling of strange encoding if needed.""" + return os.path.split(path) + + def lexists(self, path=None): + return os.path.lexists(self.join(path)) + + def lstat(self, path=None): + return os.lstat(self.join(path)) + + def listdir(self, path=None): + return os.listdir(self.join(path)) + + def makedir(self, path=None, notindexed=True): + return util.makedir(self.join(path), notindexed) + + def makedirs(self, path=None, mode=None): + return util.makedirs(self.join(path), mode) + + def makelock(self, info, path): + return util.makelock(info, self.join(path)) + + def mkdir(self, path=None): + return os.mkdir(self.join(path)) + + def mkstemp(self, suffix='', prefix='tmp', dir=None, text=False): + fd, name = tempfile.mkstemp(suffix=suffix, prefix=prefix, + dir=self.join(dir), text=text) + dname, fname = util.split(name) + if dir: + return fd, os.path.join(dir, fname) + else: + return fd, fname + + def readdir(self, path=None, stat=None, skip=None): + return osutil.listdir(self.join(path), stat, skip) + + def readlock(self, path): + return util.readlock(self.join(path)) + + def rename(self, src, dst, checkambig=False): + """Rename from src to dst + + checkambig argument is used with util.filestat, and is useful + only if destination file is guarded by any lock + (e.g. repo.lock or repo.wlock). 
+ """ + dstpath = self.join(dst) + oldstat = checkambig and util.filestat(dstpath) + if oldstat and oldstat.stat: + ret = util.rename(self.join(src), dstpath) + newstat = util.filestat(dstpath) + if newstat.isambig(oldstat): + # stat of renamed file is ambiguous to original one + newstat.avoidambig(dstpath, oldstat) + return ret + return util.rename(self.join(src), dstpath) + + def readlink(self, path): + return os.readlink(self.join(path)) + + def removedirs(self, path=None): + """Remove a leaf directory and all empty intermediate ones + """ + return util.removedirs(self.join(path)) + + def rmtree(self, path=None, ignore_errors=False, forcibly=False): + """Remove a directory tree recursively + + If ``forcibly``, this tries to remove READ-ONLY files, too. + """ + if forcibly: + def onerror(function, path, excinfo): + if function is not os.remove: + raise + # read-only files cannot be unlinked under Windows + s = os.stat(path) + if (s.st_mode & stat.S_IWRITE) != 0: + raise + os.chmod(path, stat.S_IMODE(s.st_mode) | stat.S_IWRITE) + os.remove(path) + else: + onerror = None + return shutil.rmtree(self.join(path), + ignore_errors=ignore_errors, onerror=onerror) + + def setflags(self, path, l, x): + return util.setflags(self.join(path), l, x) + + def stat(self, path=None): + return os.stat(self.join(path)) + + def unlink(self, path=None): + return util.unlink(self.join(path)) + + def tryunlink(self, path=None): + """Attempt to remove a file, ignoring missing file errors.""" + util.tryunlink(self.join(path)) + + def unlinkpath(self, path=None, ignoremissing=False): + return util.unlinkpath(self.join(path), ignoremissing=ignoremissing) + + def utime(self, path=None, t=None): + return os.utime(self.join(path), t) + + def walk(self, path=None, onerror=None): + """Yield (dirpath, dirs, files) tuple for each directories under path + + ``dirpath`` is relative one from the root of this vfs. This + uses ``os.sep`` as path separator, even you specify POSIX + style ``path``. 
+ + "The root of this vfs" is represented as empty ``dirpath``. + """ + root = os.path.normpath(self.join(None)) + # when dirpath == root, dirpath[prefixlen:] becomes empty + # because len(dirpath) < prefixlen. + prefixlen = len(pathutil.normasprefix(root)) + for dirpath, dirs, files in os.walk(self.join(path), onerror=onerror): + yield (dirpath[prefixlen:], dirs, files) + + @contextlib.contextmanager + def backgroundclosing(self, ui, expectedcount=-1): + """Allow files to be closed asynchronously. + + When this context manager is active, ``backgroundclose`` can be passed + to ``__call__``/``open`` to result in the file possibly being closed + asynchronously, on a background thread. + """ + # This is an arbitrary restriction and could be changed if we ever + # have a use case. + vfs = getattr(self, 'vfs', self) + if getattr(vfs, '_backgroundfilecloser', None): + raise error.Abort( + _('can only have 1 active background file closer')) + + with backgroundfilecloser(ui, expectedcount=expectedcount) as bfc: + try: + vfs._backgroundfilecloser = bfc + yield bfc + finally: + vfs._backgroundfilecloser = None + +class vfs(abstractvfs): + '''Operate files relative to a base directory + + This class is used to hide the details of COW semantics and + remote file access from higher level code. 
+ ''' + def __init__(self, base, audit=True, expandpath=False, realpath=False): + if expandpath: + base = util.expandpath(base) + if realpath: + base = os.path.realpath(base) + self.base = base + self.mustaudit = audit + self.createmode = None + self._trustnlink = None + + @property + def mustaudit(self): + return self._audit + + @mustaudit.setter + def mustaudit(self, onoff): + self._audit = onoff + if onoff: + self.audit = pathutil.pathauditor(self.base) + else: + self.audit = util.always + + @util.propertycache + def _cansymlink(self): + return util.checklink(self.base) + + @util.propertycache + def _chmod(self): + return util.checkexec(self.base) + + def _fixfilemode(self, name): + if self.createmode is None or not self._chmod: + return + os.chmod(name, self.createmode & 0o666) + + def __call__(self, path, mode="r", text=False, atomictemp=False, + notindexed=False, backgroundclose=False, checkambig=False): + '''Open ``path`` file, which is relative to vfs root. + + Newly created directories are marked as "not to be indexed by + the content indexing service", if ``notindexed`` is specified + for "write" mode access. + + If ``backgroundclose`` is passed, the file may be closed asynchronously. + It can only be used if the ``self.backgroundclosing()`` context manager + is active. This should only be specified if the following criteria hold: + + 1. There is a potential for writing thousands of files. Unless you + are writing thousands of files, the performance benefits of + asynchronously closing files is not realized. + 2. Files are opened exactly once for the ``backgroundclosing`` + active duration and are therefore free of race conditions between + closing a file on a background thread and reopening it. (If the + file were opened multiple times, there could be unflushed data + because the original file handle hasn't been flushed/closed yet.) 
+ + ``checkambig`` argument is passed to atomictemplfile (valid + only for writing), and is useful only if target file is + guarded by any lock (e.g. repo.lock or repo.wlock). + ''' + if self._audit: + r = util.checkosfilename(path) + if r: + raise error.Abort("%s: %r" % (r, path)) + self.audit(path) + f = self.join(path) + + if not text and "b" not in mode: + mode += "b" # for that other OS + + nlink = -1 + if mode not in ('r', 'rb'): + dirname, basename = util.split(f) + # If basename is empty, then the path is malformed because it points + # to a directory. Let the posixfile() call below raise IOError. + if basename: + if atomictemp: + util.makedirs(dirname, self.createmode, notindexed) + return util.atomictempfile(f, mode, self.createmode, + checkambig=checkambig) + try: + if 'w' in mode: + util.unlink(f) + nlink = 0 + else: + # nlinks() may behave differently for files on Windows + # shares if the file is open. + with util.posixfile(f): + nlink = util.nlinks(f) + if nlink < 1: + nlink = 2 # force mktempcopy (issue1922) + except (OSError, IOError) as e: + if e.errno != errno.ENOENT: + raise + nlink = 0 + util.makedirs(dirname, self.createmode, notindexed) + if nlink > 0: + if self._trustnlink is None: + self._trustnlink = nlink > 1 or util.checknlink(f) + if nlink > 1 or not self._trustnlink: + util.rename(util.mktempcopy(f), f) + fp = util.posixfile(f, mode) + if nlink == 0: + self._fixfilemode(f) + + if checkambig: + if mode in ('r', 'rb'): + raise error.Abort(_('implementation error: mode %s is not' + ' valid for checkambig=True') % mode) + fp = checkambigatclosing(fp) + + if backgroundclose: + if not self._backgroundfilecloser: + raise error.Abort(_('backgroundclose can only be used when a ' + 'backgroundclosing context manager is active') + ) + + fp = delayclosedfile(fp, self._backgroundfilecloser) + + return fp + + def symlink(self, src, dst): + self.audit(dst) + linkname = self.join(dst) + util.tryunlink(linkname) + + 
util.makedirs(os.path.dirname(linkname), self.createmode) + + if self._cansymlink: + try: + os.symlink(src, linkname) + except OSError as err: + raise OSError(err.errno, _('could not symlink to %r: %s') % + (src, err.strerror), linkname) + else: + self.write(dst, src) + + def join(self, path, *insidef): + if path: + return os.path.join(self.base, path, *insidef) + else: + return self.base + +opener = vfs + +class auditvfs(object): + def __init__(self, vfs): + self.vfs = vfs + + @property + def mustaudit(self): + return self.vfs.mustaudit + + @mustaudit.setter + def mustaudit(self, onoff): + self.vfs.mustaudit = onoff + + @property + def options(self): + return self.vfs.options + + @options.setter + def options(self, value): + self.vfs.options = value + +class filtervfs(abstractvfs, auditvfs): + '''Wrapper vfs for filtering filenames with a function.''' + + def __init__(self, vfs, filter): + auditvfs.__init__(self, vfs) + self._filter = filter + + def __call__(self, path, *args, **kwargs): + return self.vfs(self._filter(path), *args, **kwargs) + + def join(self, path, *insidef): + if path: + return self.vfs.join(self._filter(self.vfs.reljoin(path, *insidef))) + else: + return self.vfs.join(path) + +filteropener = filtervfs + +class readonlyvfs(abstractvfs, auditvfs): + '''Wrapper vfs preventing any writing.''' + + def __init__(self, vfs): + auditvfs.__init__(self, vfs) + + def __call__(self, path, mode='r', *args, **kw): + if mode not in ('r', 'rb'): + raise error.Abort(_('this vfs is read only')) + return self.vfs(path, mode, *args, **kw) + + def join(self, path, *insidef): + return self.vfs.join(path, *insidef) + +class closewrapbase(object): + """Base class of wrapper, which hooks closing + + Do not instantiate outside of the vfs layer. 
+ """ + def __init__(self, fh): + object.__setattr__(self, '_origfh', fh) + + def __getattr__(self, attr): + return getattr(self._origfh, attr) + + def __setattr__(self, attr, value): + return setattr(self._origfh, attr, value) + + def __delattr__(self, attr): + return delattr(self._origfh, attr) + + def __enter__(self): + return self._origfh.__enter__() + + def __exit__(self, exc_type, exc_value, exc_tb): + raise NotImplementedError('attempted instantiating ' + str(type(self))) + + def close(self): + raise NotImplementedError('attempted instantiating ' + str(type(self))) + +class delayclosedfile(closewrapbase): + """Proxy for a file object whose close is delayed. + + Do not instantiate outside of the vfs layer. + """ + def __init__(self, fh, closer): + super(delayclosedfile, self).__init__(fh) + object.__setattr__(self, '_closer', closer) + + def __exit__(self, exc_type, exc_value, exc_tb): + self._closer.close(self._origfh) + + def close(self): + self._closer.close(self._origfh) + +class backgroundfilecloser(object): + """Coordinates background closing of file handles on multiple threads.""" + def __init__(self, ui, expectedcount=-1): + self._running = False + self._entered = False + self._threads = [] + self._threadexception = None + + # Only Windows/NTFS has slow file closing. So only enable by default + # on that platform. But allow to be enabled elsewhere for testing. + defaultenabled = pycompat.osname == 'nt' + enabled = ui.configbool('worker', 'backgroundclose', defaultenabled) + + if not enabled: + return + + # There is overhead to starting and stopping the background threads. + # Don't do background processing unless the file count is large enough + # to justify it. + minfilecount = ui.configint('worker', 'backgroundcloseminfilecount', + 2048) + # FUTURE dynamically start background threads after minfilecount closes. 
+ # (We don't currently have any callers that don't know their file count) + if expectedcount > 0 and expectedcount < minfilecount: + return + + # Windows defaults to a limit of 512 open files. A buffer of 128 + # should give us enough headway. + maxqueue = ui.configint('worker', 'backgroundclosemaxqueue', 384) + threadcount = ui.configint('worker', 'backgroundclosethreadcount', 4) + + ui.debug('starting %d threads for background file closing\n' % + threadcount) + + self._queue = util.queue(maxsize=maxqueue) + self._running = True + + for i in range(threadcount): + t = threading.Thread(target=self._worker, name='backgroundcloser') + self._threads.append(t) + t.start() + + def __enter__(self): + self._entered = True + return self + + def __exit__(self, exc_type, exc_value, exc_tb): + self._running = False + + # Wait for threads to finish closing so open files don't linger for + # longer than lifetime of context manager. + for t in self._threads: + t.join() + + def _worker(self): + """Main routine for worker thread.""" + while True: + try: + fh = self._queue.get(block=True, timeout=0.100) + # Need to catch or the thread will terminate and + # we could orphan file descriptors. + try: + fh.close() + except Exception as e: + # Stash so can re-raise from main thread later. + self._threadexception = e + except util.empty: + if not self._running: + break + + def close(self, fh): + """Schedule a file for closing.""" + if not self._entered: + raise error.Abort(_('can only call close() when context manager ' + 'active')) + + # If a background thread encountered an exception, raise now so we fail + # fast. Otherwise we may potentially go on for minutes until the error + # is acted on. + if self._threadexception: + e = self._threadexception + self._threadexception = None + raise e + + # If we're not actively running, close synchronously. 
+ if not self._running: + fh.close() + return + + self._queue.put(fh, block=True, timeout=None) + +class checkambigatclosing(closewrapbase): + """Proxy for a file object, to avoid ambiguity of file stat + + See also util.filestat for detail about "ambiguity of file stat". + + This proxy is useful only if the target file is guarded by any + lock (e.g. repo.lock or repo.wlock) + + Do not instantiate outside of the vfs layer. + """ + def __init__(self, fh): + super(checkambigatclosing, self).__init__(fh) + object.__setattr__(self, '_oldstat', util.filestat(fh.name)) + + def _checkambig(self): + oldstat = self._oldstat + if oldstat.stat: + newstat = util.filestat(self._origfh.name) + if newstat.isambig(oldstat): + # stat of changed file is ambiguous to original one + newstat.avoidambig(self._origfh.name, oldstat) + + def __exit__(self, exc_type, exc_value, exc_tb): + self._origfh.__exit__(exc_type, exc_value, exc_tb) + self._checkambig() + + def close(self): + self._origfh.close() + self._checkambig() diff -r ed5b25874d99 -r 4baf79a77afa mercurial/windows.py --- a/mercurial/windows.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/windows.py Fri Mar 24 08:37:26 2017 -0700 @@ -385,19 +385,6 @@ break head, tail = os.path.split(head) -def unlinkpath(f, ignoremissing=False): - """unlink and remove the directory if it is empty""" - try: - unlink(f) - except OSError as e: - if not (ignoremissing and e.errno == errno.ENOENT): - raise - # try removing directories that might now be empty - try: - removedirs(os.path.dirname(f)) - except OSError: - pass - def rename(src, dst): '''atomically rename file src to dst, replacing dst if it exists''' try: @@ -442,7 +429,7 @@ try: val = winreg.QueryValueEx(winreg.OpenKey(s, key), valname)[0] # never let a Unicode string escape into the wild - return encoding.tolocal(val.encode('UTF-8')) + return encoding.unitolocal(val) except EnvironmentError: pass diff -r ed5b25874d99 -r 4baf79a77afa mercurial/wireproto.py --- a/mercurial/wireproto.py 
Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/wireproto.py Fri Mar 24 08:37:26 2017 -0700 @@ -26,6 +26,7 @@ exchange, peer, pushkey as pushkeymod, + pycompat, streamclone, util, ) @@ -735,7 +736,7 @@ depending on the request. e.g. you could advertise URLs for the closest data center given the client's IP address. """ - return repo.opener.tryread('clonebundles.manifest') + return repo.vfs.tryread('clonebundles.manifest') wireprotocaps = ['lookup', 'changegroupsubset', 'branchmap', 'pushkey', 'known', 'getbundle', 'unbundlehash', 'batch'] @@ -839,7 +840,6 @@ raise error.Abort(bundle2requiredmain, hint=bundle2requiredhint) - #chunks = exchange.getbundlechunks(repo, 'serve', **opts) try: chunks = exchange.getbundlechunks(repo, 'serve', **opts) except error.Abort as exc: @@ -900,7 +900,7 @@ def pushkey(repo, proto, namespace, key, old, new): # compatibility with pre-1.8 clients which were accidentally # sending raw binary nodes rather than utf-8-encoded hex - if len(new) == 20 and new.encode('string-escape') != new: + if len(new) == 20 and util.escapestr(new) != new: # looks like it could be a binary node try: new.decode('utf-8') @@ -961,7 +961,7 @@ # write bundle data to temporary file because it can be big fd, tempname = tempfile.mkstemp(prefix='hg-unbundle-') - fp = os.fdopen(fd, 'wb+') + fp = os.fdopen(fd, pycompat.sysstr('wb+')) r = 0 try: proto.getfile(fp) diff -r ed5b25874d99 -r 4baf79a77afa mercurial/worker.py --- a/mercurial/worker.py Thu Mar 23 19:54:59 2017 -0700 +++ b/mercurial/worker.py Fri Mar 24 08:37:26 2017 -0700 @@ -164,7 +164,7 @@ os._exit(0) pids.add(pid) os.close(wfd) - fp = os.fdopen(rfd, 'rb', 0) + fp = os.fdopen(rfd, pycompat.sysstr('rb'), 0) def cleanup(): signal.signal(signal.SIGINT, oldhandler) waitforworkers() diff -r ed5b25874d99 -r 4baf79a77afa setup.py --- a/setup.py Thu Mar 23 19:54:59 2017 -0700 +++ b/setup.py Fri Mar 24 08:37:26 2017 -0700 @@ -63,7 +63,10 @@ import shutil import tempfile from distutils import log -if 'FORCE_SETUPTOOLS' 
in os.environ: +# We have issues with setuptools on some platforms and builders. Until +# those are resolved, setuptools is opt-in except for platforms where +# we don't have issues. +if os.name == 'nt' or 'FORCE_SETUPTOOLS' in os.environ: from setuptools import setup else: from distutils.core import setup @@ -91,17 +94,13 @@ # We remove hg.bat if we are able to build hg.exe. scripts.append('contrib/win32/hg.bat') -# simplified version of distutils.ccompiler.CCompiler.has_function -# that actually removes its temporary files. -def hasfunction(cc, funcname): +def cancompile(cc, code): tmpdir = tempfile.mkdtemp(prefix='hg-install-') devnull = oldstderr = None try: - fname = os.path.join(tmpdir, 'funcname.c') + fname = os.path.join(tmpdir, 'testcomp.c') f = open(fname, 'w') - f.write('int main(void) {\n') - f.write(' %s();\n' % funcname) - f.write('}\n') + f.write(code) f.close() # Redirect stderr to /dev/null to hide any error messages # from the compiler. @@ -122,6 +121,16 @@ devnull.close() shutil.rmtree(tmpdir) +# simplified version of distutils.ccompiler.CCompiler.has_function +# that actually removes its temporary files. 
+def hasfunction(cc, funcname): + code = 'int main(void) { %s(); }\n' % funcname + return cancompile(cc, code) + +def hasheader(cc, headername): + code = '#include <%s>\nint main(void) { return 0; }\n' % headername + return cancompile(cc, code) + # py2exe needs to be installed to work try: import py2exe @@ -367,7 +376,7 @@ modulepolicy = 'c' with open("mercurial/__modulepolicy__.py", "w") as f: f.write('# this file is autogenerated by setup.py\n') - f.write('modulepolicy = "%s"\n' % modulepolicy) + f.write('modulepolicy = b"%s"\n' % modulepolicy) build_py.run(self) @@ -581,11 +590,29 @@ osutil_cflags = [] osutil_ldflags = [] -# platform specific macros: HAVE_SETPROCTITLE -for plat, func in [(re.compile('freebsd'), 'setproctitle')]: - if plat.search(sys.platform) and hasfunction(new_compiler(), func): +# platform specific macros +for plat, func in [('bsd', 'setproctitle'), ('bsd|darwin|linux', 'statfs')]: + if re.search(plat, sys.platform) and hasfunction(new_compiler(), func): osutil_cflags.append('-DHAVE_%s' % func.upper()) +for plat, header in [ + ('linux', 'linux/magic.h'), + ('linux', 'sys/vfs.h'), +]: + if re.search(plat, sys.platform) and hasheader(new_compiler(), header): + macro = header.replace('/', '_').replace('.', '_').upper() + osutil_cflags.append('-DHAVE_%s' % macro) + +for plat, macro, code in [ + ('bsd|darwin', 'BSD_STATFS', ''' + #include + #include + int main() { struct statfs s; return sizeof(s.f_fstypename); } + '''), +]: + if re.search(plat, sys.platform) and cancompile(new_compiler(), code): + osutil_cflags.append('-DHAVE_%s' % macro) + if sys.platform == 'darwin': osutil_ldflags += ['-framework', 'ApplicationServices'] @@ -658,7 +685,14 @@ packagedata['mercurial'].append(f) datafiles = [] -setupversion = version + +# distutils expects version to be str/unicode. Converting it to +# unicode on Python 2 still works because it won't contain any +# non-ascii bytes and will be implicitly converted back to bytes +# when operated on. 
+assert isinstance(version, bytes) +setupversion = version.decode('ascii') + extra = {} if py2exeloaded: diff -r ed5b25874d99 -r 4baf79a77afa tests/dumbhttp.py --- a/tests/dumbhttp.py Thu Mar 23 19:54:59 2017 -0700 +++ b/tests/dumbhttp.py Fri Mar 24 08:37:26 2017 -0700 @@ -7,7 +7,9 @@ """ import optparse +import os import signal +import socket import sys from mercurial import ( @@ -18,11 +20,17 @@ httpserver = util.httpserver OptionParser = optparse.OptionParser +if os.environ.get('HGIPV6', '0') == '1': + class simplehttpserver(httpserver.httpserver): + address_family = socket.AF_INET6 +else: + simplehttpserver = httpserver.httpserver + class simplehttpservice(object): def __init__(self, host, port): self.address = (host, port) def init(self): - self.httpd = httpserver.httpserver( + self.httpd = simplehttpserver( self.address, httpserver.simplehttprequesthandler) def run(self): self.httpd.serve_forever() diff -r ed5b25874d99 -r 4baf79a77afa tests/dummyssh --- a/tests/dummyssh Thu Mar 23 19:54:59 2017 -0700 +++ b/tests/dummyssh Fri Mar 24 08:37:26 2017 -0700 @@ -10,7 +10,7 @@ if sys.argv[1] != "user@dummy": sys.exit(-1) -os.environ["SSH_CLIENT"] = "127.0.0.1 1 2" +os.environ["SSH_CLIENT"] = "%s 1 2" % os.environ.get('LOCALIP', '127.0.0.1') log = open("dummylog", "ab") log.write("Got arguments") diff -r ed5b25874d99 -r 4baf79a77afa tests/hghave.py --- a/tests/hghave.py Thu Mar 23 19:54:59 2017 -0700 +++ b/tests/hghave.py Fri Mar 24 08:37:26 2017 -0700 @@ -346,6 +346,12 @@ finally: os.unlink(fn) +@check("hardlink-whitelisted", "hardlinks on whitelisted filesystems") +def has_hardlink_whitelisted(): + from mercurial import osutil, util + fstype = getattr(osutil, 'getfstype', lambda x: None)('.') + return fstype in util._hardlinkfswhitelist + @check("rmcwd", "can remove current working directory") def has_rmcwd(): ocwd = os.getcwd() @@ -413,6 +419,12 @@ br":1: 're' imported but unused", True) +@check("pylint", "Pylint python linter") +def has_pylint(): + return 
matchoutput("pylint --help", + br"Usage: pylint", + True) + @check("pygments", "Pygments source highlighting library") def has_pygments(): try: diff -r ed5b25874d99 -r 4baf79a77afa tests/run-tests.py --- a/tests/run-tests.py Thu Mar 23 19:54:59 2017 -0700 +++ b/tests/run-tests.py Fri Mar 24 08:37:26 2017 -0700 @@ -112,18 +112,51 @@ # For Windows support wifexited = getattr(os, "WIFEXITED", lambda x: False) -def checkportisavailable(port): - """return true if a port seems free to bind on localhost""" +# Whether to use IPv6 +def checksocketfamily(name, port=20058): + """return true if we can listen on localhost using family=name + + name should be either 'AF_INET', or 'AF_INET6'. + port being used is okay - EADDRINUSE is considered as successful. + """ + family = getattr(socket, name, None) + if family is None: + return False try: - s = socket.socket(socket.AF_INET, socket.SOCK_STREAM) + s = socket.socket(family, socket.SOCK_STREAM) s.bind(('localhost', port)) s.close() return True except socket.error as exc: - if not exc.errno == errno.EADDRINUSE: + if exc.errno == errno.EADDRINUSE: + return True + elif exc.errno in (errno.EADDRNOTAVAIL, errno.EPROTONOSUPPORT): + return False + else: raise + else: return False +# useipv6 will be set by parseargs +useipv6 = None + +def checkportisavailable(port): + """return true if a port seems free to bind on localhost""" + if useipv6: + family = socket.AF_INET6 + else: + family = socket.AF_INET + try: + s = socket.socket(family, socket.SOCK_STREAM) + s.bind(('localhost', port)) + s.close() + return True + except socket.error as exc: + if exc.errno not in (errno.EADDRINUSE, errno.EADDRNOTAVAIL, + errno.EPROTONOSUPPORT): + raise + return False + closefds = os.name == 'posix' def Popen4(cmd, wd, timeout, env=None): processlock.acquire() @@ -269,6 +302,8 @@ help="install and use chg wrapper in place of hg") parser.add_option("--with-chg", metavar="CHG", help="use specified chg wrapper in place of hg") + parser.add_option("--ipv6", 
action="store_true", + help="prefer IPv6 to IPv4 for network related tests") parser.add_option("-3", "--py3k-warnings", action="store_true", help="enable Py3k warnings on Python 2.6+") # This option should be deleted once test-check-py3-compat.t and other @@ -338,6 +373,14 @@ parser.error('--chg does not work when --with-hg is specified ' '(use --with-chg instead)') + global useipv6 + if options.ipv6: + useipv6 = checksocketfamily('AF_INET6') + else: + # only use IPv6 if IPv4 is unavailable and IPv6 is available + useipv6 = ((not checksocketfamily('AF_INET')) + and checksocketfamily('AF_INET6')) + options.anycoverage = options.cover or options.annotate or options.htmlcov if options.anycoverage: try: @@ -506,7 +549,8 @@ timeout=defaults['timeout'], startport=defaults['port'], extraconfigopts=None, py3kwarnings=False, shell=None, hgcommand=None, - slowtimeout=defaults['slowtimeout'], usechg=False): + slowtimeout=defaults['slowtimeout'], usechg=False, + useipv6=False): """Create a test from parameters. path is the full path to the file defining the test. @@ -554,6 +598,7 @@ self._shell = _bytespath(shell) self._hgcommand = hgcommand or b'hg' self._usechg = usechg + self._useipv6 = useipv6 self._aborted = False self._daemonpids = [] @@ -802,6 +847,7 @@ self._portmap(2), (br'(?m)^(saved backup bundle to .*\.hg)( \(glob\))?$', br'\1 (glob)'), + (br'([^0-9])%s' % re.escape(self._localip()), br'\1$LOCALIP'), ] r.append((self._escapepath(self._testtmp), b'$TESTTMP')) @@ -817,6 +863,12 @@ else: return re.escape(p) + def _localip(self): + if self._useipv6: + return b'::1' + else: + return b'127.0.0.1' + def _getenv(self): """Obtain environment variables to use during test execution.""" def defineport(i): @@ -839,6 +891,11 @@ env["HGUSER"] = "test" env["HGENCODING"] = "ascii" env["HGENCODINGMODE"] = "strict" + env['HGIPV6'] = str(int(self._useipv6)) + + # LOCALIP could be ::1 or 127.0.0.1. Useful for tests that require raw + # IP addresses. 
+ env['LOCALIP'] = self._localip() # Reset some environment variables to well-known values so that # the tests produce repeatable output. @@ -849,6 +906,7 @@ env['TERM'] = 'xterm' for k in ('HG HGPROF CDPATH GREP_OPTIONS http_proxy no_proxy ' + + 'HGPLAIN HGPLAINEXCEPT ' + 'NO_PROXY CHGDEBUG').split(): if k in env: del env[k] @@ -881,6 +939,9 @@ hgrc.write(b'[largefiles]\n') hgrc.write(b'usercache = %s\n' % (os.path.join(self._testtmp, b'.cache/largefiles'))) + hgrc.write(b'[web]\n') + hgrc.write(b'address = localhost\n') + hgrc.write(b'ipv6 = %s\n' % str(self._useipv6).encode('ascii')) for opt in self._extraconfigopts: section, key = opt.split('.', 1) @@ -2288,7 +2349,8 @@ py3kwarnings=self.options.py3k_warnings, shell=self.options.shell, hgcommand=self._hgcommand, - usechg=bool(self.options.with_chg or self.options.chg)) + usechg=bool(self.options.with_chg or self.options.chg), + useipv6=useipv6) t.should_reload = True return t diff -r ed5b25874d99 -r 4baf79a77afa tests/test-addremove-similar.t --- a/tests/test-addremove-similar.t Thu Mar 23 19:54:59 2017 -0700 +++ b/tests/test-addremove-similar.t Fri Mar 24 08:37:26 2017 -0700 @@ -55,6 +55,78 @@ $ hg commit -m B +should be sorted by path for stable result + + $ for i in `python $TESTDIR/seq.py 0 9`; do + > cp small-file $i + > done + $ rm small-file + $ hg addremove + adding 0 + adding 1 + adding 2 + adding 3 + adding 4 + adding 5 + adding 6 + adding 7 + adding 8 + adding 9 + removing small-file + recording removal of small-file as rename to 0 (100% similar) + recording removal of small-file as rename to 1 (100% similar) + recording removal of small-file as rename to 2 (100% similar) + recording removal of small-file as rename to 3 (100% similar) + recording removal of small-file as rename to 4 (100% similar) + recording removal of small-file as rename to 5 (100% similar) + recording removal of small-file as rename to 6 (100% similar) + recording removal of small-file as rename to 7 (100% similar) + recording 
removal of small-file as rename to 8 (100% similar) + recording removal of small-file as rename to 9 (100% similar) + $ hg commit -m '10 same files' + +pick one from many identical files + + $ cp 0 a + $ rm `python $TESTDIR/seq.py 0 9` + $ hg addremove + removing 0 + removing 1 + removing 2 + removing 3 + removing 4 + removing 5 + removing 6 + removing 7 + removing 8 + removing 9 + adding a + recording removal of 0 as rename to a (100% similar) + $ hg revert -aq + +pick one from many similar files + + $ cp 0 a + $ for i in `python $TESTDIR/seq.py 0 9`; do + > echo $i >> $i + > done + $ hg commit -m 'make them slightly different' + $ rm `python $TESTDIR/seq.py 0 9` + $ hg addremove -s50 + removing 0 + removing 1 + removing 2 + removing 3 + removing 4 + removing 5 + removing 6 + removing 7 + removing 8 + removing 9 + adding a + recording removal of 0 as rename to a (99% similar) + $ hg commit -m 'always the same file should be selected' + should all fail $ hg addremove -s foo diff -r ed5b25874d99 -r 4baf79a77afa tests/test-archive.t --- a/tests/test-archive.t Thu Mar 23 19:54:59 2017 -0700 +++ b/tests/test-archive.t Fri Mar 24 08:37:26 2017 -0700 @@ -99,7 +99,7 @@ > except AttributeError: > stdout = sys.stdout > try: - > f = util.urlreq.urlopen('http://127.0.0.1:%s/?%s' + > f = util.urlreq.urlopen('http://$LOCALIP:%s/?%s' > % (os.environ['HGPORT'], requeststr)) > stdout.write(f.read()) > except util.urlerr.httperror as e: diff -r ed5b25874d99 -r 4baf79a77afa tests/test-basic.t --- a/tests/test-basic.t Thu Mar 23 19:54:59 2017 -0700 +++ b/tests/test-basic.t Fri Mar 24 08:37:26 2017 -0700 @@ -11,6 +11,8 @@ ui.interactive=False ui.mergemarkers=detailed ui.promptecho=True + web.address=localhost + web\.ipv6=(?:True|False) (re) $ hg init t $ cd t diff -r ed5b25874d99 -r 4baf79a77afa tests/test-bdiff.py --- a/tests/test-bdiff.py Thu Mar 23 19:54:59 2017 -0700 +++ b/tests/test-bdiff.py Fri Mar 24 08:37:26 2017 -0700 @@ -3,8 +3,6 @@ import struct import unittest -import 
silenttestrunner - from mercurial import ( bdiff, mpatch, @@ -148,4 +146,5 @@ ['a\n', diffreplace(2, 10, 'a\na\na\na\n', '')]) if __name__ == '__main__': + import silenttestrunner silenttestrunner.main(__name__) diff -r ed5b25874d99 -r 4baf79a77afa tests/test-blackbox.t --- a/tests/test-blackbox.t Thu Mar 23 19:54:59 2017 -0700 +++ b/tests/test-blackbox.t Fri Mar 24 08:37:26 2017 -0700 @@ -25,7 +25,7 @@ 1970/01/01 00:00:00 bob @0000000000000000000000000000000000000000 (5000)> add a 1970/01/01 00:00:00 bob @0000000000000000000000000000000000000000 (5000)> add a exited 0 after * seconds (glob) 1970/01/01 00:00:00 bob @0000000000000000000000000000000000000000+ (5000)> blackbox - 1970/01/01 00:00:00 bob @0000000000000000000000000000000000000000+ (5000)> blackbox --config blackbox.dirty=True exited 0 after * seconds (glob) + 1970/01/01 00:00:00 bob @0000000000000000000000000000000000000000+ (5000)> blackbox --config 'blackbox.dirty=True' exited 0 after * seconds (glob) 1970/01/01 00:00:00 bob @0000000000000000000000000000000000000000 (5000)> confuse 1970/01/01 00:00:00 bob @0000000000000000000000000000000000000000 (5000)> alias 'confuse' expands to 'log --limit 3' 1970/01/01 00:00:00 bob @0000000000000000000000000000000000000000 (5000)> confuse exited 0 after * seconds (glob) diff -r ed5b25874d99 -r 4baf79a77afa tests/test-bookmarks.t --- a/tests/test-bookmarks.t Thu Mar 23 19:54:59 2017 -0700 +++ b/tests/test-bookmarks.t Fri Mar 24 08:37:26 2017 -0700 @@ -1,4 +1,5 @@ - $ hg init + $ hg init repo + $ cd repo no bookmarks @@ -630,7 +631,7 @@ Z 2:db815d6d32e6 x y 2:db815d6d32e6 $ hg -R ../cloned-bookmarks-manual-update-with-divergence pull - pulling from $TESTTMP + pulling from $TESTTMP/repo (glob) searching for changes adding changesets adding manifests @@ -895,3 +896,58 @@ $ touch $TESTTMP/unpause $ cd .. + +check whether HG_PENDING makes pending changes only in related +repositories visible to an external hook. 
+ +(emulate a transaction running concurrently by copied +.hg/bookmarks.pending in subsequent test) + + $ cat > $TESTTMP/savepending.sh <<EOF + > cp .hg/bookmarks.pending .hg/bookmarks.pending.saved + > exit 1 # to avoid adding new bookmark for subsequent tests + > EOF + + $ hg init unrelated + $ cd unrelated + $ echo a > a + $ hg add a + $ hg commit -m '#0' + $ hg --config hooks.pretxnclose="sh $TESTTMP/savepending.sh" bookmarks INVISIBLE + transaction abort! + rollback completed + abort: pretxnclose hook exited with status 1 + [255] + $ cp .hg/bookmarks.pending.saved .hg/bookmarks.pending + +(check visible bookmarks while transaction running in repo) + + $ cat > $TESTTMP/checkpending.sh <<EOF + > echo "@repo" + > hg -R $TESTTMP/repo bookmarks + > echo "@unrelated" + > hg -R $TESTTMP/unrelated bookmarks + > exit 1 # to avoid adding new bookmark for subsequent tests + > EOF + + $ cd ../repo + $ hg --config hooks.pretxnclose="sh $TESTTMP/checkpending.sh" bookmarks NEW + @repo + * NEW 6:81dcce76aa0b + X2 1:925d80f479bb + Y 4:125c9a1d6df6 + Z 5:5fb12f0f2d51 + Z@1 1:925d80f479bb + Z@2 4:125c9a1d6df6 + foo 3:9ba5f110a0b3 + foo@1 0:f7b1eb17ad24 + foo@2 2:db815d6d32e6 + four 3:9ba5f110a0b3 + should-end-on-two 2:db815d6d32e6 + x y 2:db815d6d32e6 + @unrelated + no bookmarks set + transaction abort! 
+ rollback completed + abort: pretxnclose hook exited with status 1 + [255] diff -r ed5b25874d99 -r 4baf79a77afa tests/test-branches.t --- a/tests/test-branches.t Thu Mar 23 19:54:59 2017 -0700 +++ b/tests/test-branches.t Fri Mar 24 08:37:26 2017 -0700 @@ -1,5 +1,10 @@ $ hg init a $ cd a + +Verify checking branch of nullrev before the cache is created doesnt crash + $ hg log -r 'branch(.)' -T '{branch}\n' + +Basic test $ echo 'root' >root $ hg add root $ hg commit -d '0 0' -m "Adding root node" @@ -519,6 +524,12 @@ $ hg branches --closed -T '{if(closed, "{branch}\n")}' c + $ hg branches -T '{word(0, branch)}: {desc|firstline}\n' + b: reopen branch with a change + a: Adding d branch + a: Adding b branch head 2 + default: Adding root node + Tests of revision branch name caching We rev branch cache is updated automatically. In these tests we use a trick to diff -r ed5b25874d99 -r 4baf79a77afa tests/test-bundle2-exchange.t --- a/tests/test-bundle2-exchange.t Thu Mar 23 19:54:59 2017 -0700 +++ b/tests/test-bundle2-exchange.t Fri Mar 24 08:37:26 2017 -0700 @@ -340,7 +340,7 @@ remote: lock: free remote: wlock: free remote: postclose-tip:5fddd98957c8 draft book_5fdd - remote: txnclose hook: HG_BOOKMARK_MOVED=1 HG_BUNDLE2=1 HG_NEW_OBSMARKERS=1 HG_NODE=5fddd98957c8a54a4d436dfe1da9d87f21a1b97b HG_NODE_LAST=5fddd98957c8a54a4d436dfe1da9d87f21a1b97b HG_SOURCE=serve HG_TXNID=TXN:* HG_TXNNAME=serve HG_URL=remote:ssh:127.0.0.1 (glob) + remote: txnclose hook: HG_BOOKMARK_MOVED=1 HG_BUNDLE2=1 HG_NEW_OBSMARKERS=1 HG_NODE=5fddd98957c8a54a4d436dfe1da9d87f21a1b97b HG_NODE_LAST=5fddd98957c8a54a4d436dfe1da9d87f21a1b97b HG_SOURCE=serve HG_TXNID=TXN:* HG_TXNNAME=serve HG_URL=remote:ssh:$LOCALIP (glob) updating bookmark book_5fdd pre-close-tip:02de42196ebe draft book_02de postclose-tip:02de42196ebe draft book_02de @@ -394,7 +394,7 @@ remote: lock: free remote: wlock: free remote: postclose-tip:32af7686d403 public book_32af - remote: txnclose hook: HG_BOOKMARK_MOVED=1 HG_BUNDLE2=1 
HG_NEW_OBSMARKERS=1 HG_NODE=32af7686d403cf45b5d95f2d70cebea587ac806a HG_NODE_LAST=32af7686d403cf45b5d95f2d70cebea587ac806a HG_PHASES_MOVED=1 HG_SOURCE=serve HG_TXNID=TXN:* HG_TXNNAME=serve HG_URL=remote:http:127.0.0.1: (glob) + remote: txnclose hook: HG_BOOKMARK_MOVED=1 HG_BUNDLE2=1 HG_NEW_OBSMARKERS=1 HG_NODE=32af7686d403cf45b5d95f2d70cebea587ac806a HG_NODE_LAST=32af7686d403cf45b5d95f2d70cebea587ac806a HG_PHASES_MOVED=1 HG_SOURCE=serve HG_TXNID=TXN:* HG_TXNNAME=serve HG_URL=remote:http:*: (glob) updating bookmark book_32af pre-close-tip:02de42196ebe draft book_02de postclose-tip:02de42196ebe draft book_02de diff -r ed5b25874d99 -r 4baf79a77afa tests/test-bundle2-remote-changegroup.t --- a/tests/test-bundle2-remote-changegroup.t Thu Mar 23 19:54:59 2017 -0700 +++ b/tests/test-bundle2-remote-changegroup.t Fri Mar 24 08:37:26 2017 -0700 @@ -39,7 +39,7 @@ > part = bundler.newpart(name, data=data) > return part > - > for line in open(repo.join('bundle2maker'), 'r'): + > for line in open(repo.vfs.join('bundle2maker'), 'r'): > line = line.strip() > try: > verb, args = line.split(None, 1) diff -r ed5b25874d99 -r 4baf79a77afa tests/test-check-code.t --- a/tests/test-check-code.t Thu Mar 23 19:54:59 2017 -0700 +++ b/tests/test-check-code.t Fri Mar 24 08:37:26 2017 -0700 @@ -22,13 +22,19 @@ mercurial/encoding.py:61: > for k, v in os.environ.items()) use encoding.environ instead (py3) - mercurial/encoding.py:203: + mercurial/encoding.py:221: > for k, v in os.environ.items()) use encoding.environ instead (py3) Skipping mercurial/httpclient/__init__.py it has no-che?k-code (glob) Skipping mercurial/httpclient/_readers.py it has no-che?k-code (glob) - mercurial/policy.py:45: - > policy = os.environ.get('HGMODULEPOLICY', policy) + mercurial/policy.py:46: + > if 'HGMODULEPOLICY' in os.environ: + use encoding.environ instead (py3) + mercurial/policy.py:47: + > policy = os.environ['HGMODULEPOLICY'].encode('utf-8') + use encoding.environ instead (py3) + mercurial/policy.py:49: + > 
policy = os.environ.get('HGMODULEPOLICY', policy) use encoding.environ instead (py3) Skipping mercurial/statprof.py it has no-che?k-code (glob) [1] diff -r ed5b25874d99 -r 4baf79a77afa tests/test-check-help.t --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/tests/test-check-help.t Fri Mar 24 08:37:26 2017 -0700 @@ -0,0 +1,28 @@ +#require test-repo + + $ . "$TESTDIR/helpers-testrepo.sh" + + $ cat <<'EOF' > scanhelptopics.py + > from __future__ import absolute_import, print_function + > import re + > import sys + > if sys.platform == "win32": + > import os, msvcrt + > msvcrt.setmode(sys.stdout.fileno(), os.O_BINARY) + > topics = set() + > topicre = re.compile(r':hg:`help ([a-z0-9\-.]+)`') + > for fname in sys.argv: + > with open(fname) as f: + > topics.update(m.group(1) for m in topicre.finditer(f.read())) + > for s in sorted(topics): + > print(s) + > EOF + + $ cd "$TESTDIR"/.. + +Check if ":hg:`help TOPIC`" is valid: +(use "xargs -n1 -t" to see which help commands are executed) + + $ hg files 'glob:{hgext,mercurial}/**/*.py' | sed 's|\\|/|g' \ + > | xargs python "$TESTTMP/scanhelptopics.py" \ + > | xargs -n1 hg help > /dev/null diff -r ed5b25874d99 -r 4baf79a77afa tests/test-check-module-imports.t --- a/tests/test-check-module-imports.t Thu Mar 23 19:54:59 2017 -0700 +++ b/tests/test-check-module-imports.t Fri Mar 24 08:37:26 2017 -0700 @@ -3,148 +3,6 @@ $ . "$TESTDIR/helpers-testrepo.sh" $ import_checker="$TESTDIR"/../contrib/import-checker.py -Run the doctests from the import checker, and make sure -it's working correctly. 
-  $ TERM=dumb
-  $ export TERM
-  $ python -m doctest $import_checker
-
-Run additional tests for the import checker
-
-  $ mkdir testpackage
-  $ touch testpackage/__init__.py
-
-  $ cat > testpackage/multiple.py << EOF
-  > from __future__ import absolute_import
-  > import os, sys
-  > EOF
-
-  $ cat > testpackage/unsorted.py << EOF
-  > from __future__ import absolute_import
-  > import sys
-  > import os
-  > EOF
-
-  $ cat > testpackage/stdafterlocal.py << EOF
-  > from __future__ import absolute_import
-  > from . import unsorted
-  > import os
-  > EOF
-
-  $ cat > testpackage/requirerelative.py << EOF
-  > from __future__ import absolute_import
-  > import testpackage.unsorted
-  > EOF
-
-  $ cat > testpackage/importalias.py << EOF
-  > from __future__ import absolute_import
-  > import ui
-  > EOF
-
-  $ cat > testpackage/relativestdlib.py << EOF
-  > from __future__ import absolute_import
-  > from .. import os
-  > EOF
-
-  $ cat > testpackage/symbolimport.py << EOF
-  > from __future__ import absolute_import
-  > from .unsorted import foo
-  > EOF
-
-  $ cat > testpackage/latesymbolimport.py << EOF
-  > from __future__ import absolute_import
-  > from . import unsorted
-  > from mercurial.node import hex
-  > EOF
-
-  $ cat > testpackage/multiplegroups.py << EOF
-  > from __future__ import absolute_import
-  > from . import unsorted
-  > from . import more
-  > EOF
-
-  $ mkdir testpackage/subpackage
-  $ cat > testpackage/subpackage/levelpriority.py << EOF
-  > from __future__ import absolute_import
-  > from . import foo
-  > from .. import parent
-  > EOF
-
-  $ touch testpackage/subpackage/foo.py
-  $ cat > testpackage/subpackage/__init__.py << EOF
-  > from __future__ import absolute_import
-  > from . import levelpriority # should not cause cycle
-  > EOF
-
-  $ cat > testpackage/subpackage/localimport.py << EOF
-  > from __future__ import absolute_import
-  > from . import foo
-  > def bar():
-  >     # should not cause "higher-level import should come first"
-  >     from .. import unsorted
-  >     # but other errors should be detected
-  >     from .. import more
-  >     import testpackage.subpackage.levelpriority
-  > EOF
-
-  $ cat > testpackage/importmodulefromsub.py << EOF
-  > from __future__ import absolute_import
-  > from .subpackage import foo # not a "direct symbol import"
-  > EOF
-
-  $ cat > testpackage/importsymbolfromsub.py << EOF
-  > from __future__ import absolute_import
-  > from .subpackage import foo, nonmodule
-  > EOF
-
-  $ cat > testpackage/sortedentries.py << EOF
-  > from __future__ import absolute_import
-  > from . import (
-  >     foo,
-  >     bar,
-  > )
-  > EOF
-
-  $ cat > testpackage/importfromalias.py << EOF
-  > from __future__ import absolute_import
-  > from . import ui
-  > EOF
-
-  $ cat > testpackage/importfromrelative.py << EOF
-  > from __future__ import absolute_import
-  > from testpackage.unsorted import foo
-  > EOF
-
-  $ mkdir testpackage2
-  $ touch testpackage2/__init__.py
-
-  $ cat > testpackage2/latesymbolimport.py << EOF
-  > from __future__ import absolute_import
-  > from testpackage import unsorted
-  > from mercurial.node import hex
-  > EOF
-
-  $ python "$import_checker" testpackage*/*.py testpackage/subpackage/*.py
-  testpackage/importalias.py:2: ui module must be "as" aliased to uimod
-  testpackage/importfromalias.py:2: ui from testpackage must be "as" aliased to uimod
-  testpackage/importfromrelative.py:2: import should be relative: testpackage.unsorted
-  testpackage/importfromrelative.py:2: direct symbol import foo from testpackage.unsorted
-  testpackage/importsymbolfromsub.py:2: direct symbol import nonmodule from testpackage.subpackage
-  testpackage/latesymbolimport.py:3: symbol import follows non-symbol import: mercurial.node
-  testpackage/multiple.py:2: multiple imported names: os, sys
-  testpackage/multiplegroups.py:3: multiple "from . import" statements
-  testpackage/relativestdlib.py:2: relative import of stdlib module
-  testpackage/requirerelative.py:2: import should be relative: testpackage.unsorted
-  testpackage/sortedentries.py:2: imports from testpackage not lexically sorted: bar < foo
-  testpackage/stdafterlocal.py:3: stdlib import "os" follows local import: testpackage
-  testpackage/subpackage/levelpriority.py:3: higher-level import should come first: testpackage
-  testpackage/subpackage/localimport.py:7: multiple "from .. import" statements
-  testpackage/subpackage/localimport.py:8: import should be relative: testpackage.subpackage.levelpriority
-  testpackage/symbolimport.py:2: direct symbol import foo from testpackage.unsorted
-  testpackage/unsorted.py:3: imports not lexically sorted: os < sys
-  testpackage2/latesymbolimport.py:3: symbol import follows non-symbol import: mercurial.node
-  [1]
-
   $ cd "$TESTDIR"/..
 
 There are a handful of cases here that require renaming a module so it
@@ -171,7 +29,7 @@
   > -X tests/test-verify-repo-operations.py \
   > -X tests/test-hook.t \
   > -X tests/test-import.t \
-  > -X tests/test-check-module-imports.t \
+  > -X tests/test-imports-checker.t \
   > -X tests/test-commit-interactive.t \
   > -X tests/test-contrib-check-code.t \
   > -X tests/test-extension.t \
diff -r ed5b25874d99 -r 4baf79a77afa tests/test-check-py3-commands.t
--- a/tests/test-check-py3-commands.t Thu Mar 23 19:54:59 2017 -0700
+++ b/tests/test-check-py3-commands.t Fri Mar 24 08:37:26 2017 -0700
@@ -3,12 +3,163 @@
 This test helps in keeping a track on which commands we can run on
 Python 3 and see what kind of errors are coming up.
 The full traceback is hidden to have a stable output.
+  $ HGBIN=`which hg`
 
   $ for cmd in version debuginstall ; do
   >   echo $cmd
-  >   $PYTHON3 `which hg` $cmd 2>&1 2>&1 | tail -1
+  >   $PYTHON3 $HGBIN $cmd 2>&1 2>&1 | tail -1
   > done
   version
   warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
debuginstall - TypeError: Can't convert 'bytes' object to str implicitly + no problems detected + +#if test-repo +Make a clone so that any features in the developer's .hg/hgrc that +might confuse Python 3 don't break this test. When we can do commit in +Python 3, we'll stop doing this. We use e76ed1e480ef for the clone +because it has different files than 273ce12ad8f1, so we can test both +`files` from dirstate and `files` loaded from a specific revision. + + $ hg clone -r e76ed1e480ef "`dirname "$TESTDIR"`" testrepo 2>&1 | tail -1 + 15 files updated, 0 files merged, 0 files removed, 0 files unresolved + +Test using -R, which exercises some URL code: + $ $PYTHON3 $HGBIN -R testrepo files -r 273ce12ad8f1 | tail -1 + testrepo/tkmerge + +Now prove `hg files` is reading the whole manifest. We have to grep +out some potential warnings that come from hgrc as yet. + $ cd testrepo + $ $PYTHON3 $HGBIN files -r 273ce12ad8f1 + .hgignore + PKG-INFO + README + hg + mercurial/__init__.py + mercurial/byterange.py + mercurial/fancyopts.py + mercurial/hg.py + mercurial/mdiff.py + mercurial/revlog.py + mercurial/transaction.py + notes.txt + setup.py + tkmerge + + $ $PYTHON3 $HGBIN files -r 273ce12ad8f1 | wc -l + \s*14 (re) + $ $PYTHON3 $HGBIN files | wc -l + \s*15 (re) + +Test if log-like commands work: + + $ $PYTHON3 $HGBIN tip + changeset: 10:e76ed1e480ef + tag: tip + user: oxymoron@cinder.waste.org + date: Tue May 03 23:37:43 2005 -0800 + summary: Fix linking of changeset revs when merging + + + $ $PYTHON3 $HGBIN log -r0 + changeset: 0:9117c6561b0b + user: mpm@selenic.com + date: Tue May 03 13:16:10 2005 -0800 + summary: Add back links from file revisions to changeset revisions + +Test if `hg status` works: + + $ mkdir a b a/1 b/1 b/2 + $ touch in_root a/in_a b/in_b a/1/in_a_1 b/1/in_b_1 b/2/in_b_2 + $ $PYTHON3 $HGBIN status + ? a/1/in_a_1 + ? a/in_a + ? b/1/in_b_1 + ? b/2/in_b_2 + ? b/in_b + ? in_root + + $ cd .. 
+#endif + +Test if `hg config` works: + + $ $PYTHON3 $HGBIN config + defaults.backout=-d "0 0" + defaults.commit=-d "0 0" + defaults.shelve=--date "0 0" + defaults.tag=-d "0 0" + devel.all-warnings=true + largefiles.usercache=$TESTTMP/.cache/largefiles + ui.slash=True + ui.interactive=False + ui.mergemarkers=detailed + ui.promptecho=True + web.address=localhost + web.ipv6=False + + $ cat > included-hgrc < [extensions] + > babar = imaginary_elephant + > EOF + $ cat >> $HGRCPATH < %include $TESTTMP/included-hgrc + > EOF + $ $PYTHON3 $HGBIN version | tail -1 + *** failed to import extension babar from imaginary_elephant: *: 'imaginary_elephant' (glob) + warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. + + $ rm included-hgrc + $ touch included-hgrc + +Test bytes-ness of policy.policy with HGMODULEPOLICY + + $ HGMODULEPOLICY=py + $ export HGMODULEPOLICY + $ $PYTHON3 `which hg` debuginstall 2>&1 2>&1 | tail -1 + no problems detected + +`hg init` can create empty repos + + $ $PYTHON3 `which hg` init py3repo + $ cd py3repo + $ echo "This is the file 'iota'." > iota + $ $PYTHON3 $HGBIN status + ? iota + $ $PYTHON3 $HGBIN add iota + $ $PYTHON3 $HGBIN status + A iota + $ $PYTHON3 $HGBIN commit --message 'commit performed in Python 3' + $ $PYTHON3 $HGBIN status + +TODO: bdiff is broken on Python 3 so we can't do a second commit yet, +when that works remove this rollback command. + $ hg rollback + repository tip rolled back to revision -1 (undo commit) + working directory now based on revision -1 + + $ mkdir A + $ echo "This is the file 'mu'." 
> A/mu + $ $PYTHON3 $HGBIN addremove + adding A/mu + $ $PYTHON3 $HGBIN status + A A/mu + A iota + $ HGEDITOR='echo message > ' $PYTHON3 $HGBIN commit + $ $PYTHON3 $HGBIN status + +Prove the repo is valid using the Python 2 `hg`: + $ hg verify + checking changesets + checking manifests + crosschecking files in changesets and manifests + checking files + 2 files, 1 changesets, 2 total revisions + $ hg log + changeset: 0:e825505ba339 + tag: tip + user: test + date: Thu Jan 01 00:00:00 1970 +0000 + summary: message + diff -r ed5b25874d99 -r 4baf79a77afa tests/test-check-py3-compat.t --- a/tests/test-check-py3-compat.t Thu Mar 23 19:54:59 2017 -0700 +++ b/tests/test-check-py3-compat.t Fri Mar 24 08:37:26 2017 -0700 @@ -7,7 +7,6 @@ contrib/python-zstandard/setup.py not using absolute_import contrib/python-zstandard/setup_zstd.py not using absolute_import contrib/python-zstandard/tests/common.py not using absolute_import - contrib/python-zstandard/tests/test_cffi.py not using absolute_import contrib/python-zstandard/tests/test_compressor.py not using absolute_import contrib/python-zstandard/tests/test_data_structures.py not using absolute_import contrib/python-zstandard/tests/test_decompressor.py not using absolute_import @@ -23,15 +22,15 @@ $ hg files 'set:(**.py) - grep(pygments)' -X hgext/fsmonitor/pywatchman \ > | sed 's|\\|/|g' | xargs $PYTHON3 contrib/check-py3-compat.py \ > | sed 's/[0-9][0-9]*)$/*)/' - hgext/convert/transport.py: error importing: No module named 'svn.client' (error at transport.py:*) + hgext/convert/transport.py: error importing: <*Error> No module named 'svn.client' (error at transport.py:*) (glob) hgext/fsmonitor/state.py: error importing: from __future__ imports must occur at the beginning of the file (__init__.py, line 30) (error at watchmanclient.py:*) hgext/fsmonitor/watchmanclient.py: error importing: from __future__ imports must occur at the beginning of the file (__init__.py, line 30) (error at watchmanclient.py:*) - 
mercurial/cffi/bdiff.py: error importing: No module named 'mercurial.cffi' (error at check-py3-compat.py:*) - mercurial/cffi/mpatch.py: error importing: No module named 'mercurial.cffi' (error at check-py3-compat.py:*) - mercurial/cffi/osutil.py: error importing: No module named 'mercurial.cffi' (error at check-py3-compat.py:*) - mercurial/scmwindows.py: error importing: No module named 'msvcrt' (error at win32.py:*) - mercurial/win32.py: error importing: No module named 'msvcrt' (error at win32.py:*) - mercurial/windows.py: error importing: No module named 'msvcrt' (error at windows.py:*) + mercurial/cffi/bdiff.py: error importing: <*Error> No module named 'mercurial.cffi' (error at check-py3-compat.py:*) (glob) + mercurial/cffi/mpatch.py: error importing: <*Error> No module named 'mercurial.cffi' (error at check-py3-compat.py:*) (glob) + mercurial/cffi/osutil.py: error importing: <*Error> No module named 'mercurial.cffi' (error at check-py3-compat.py:*) (glob) + mercurial/scmwindows.py: error importing: <*Error> No module named 'msvcrt' (error at win32.py:*) (glob) + mercurial/win32.py: error importing: <*Error> No module named 'msvcrt' (error at win32.py:*) (glob) + mercurial/windows.py: error importing: <*Error> No module named 'msvcrt' (error at windows.py:*) (glob) #endif diff -r ed5b25874d99 -r 4baf79a77afa tests/test-check-pyflakes.t --- a/tests/test-check-pyflakes.t Thu Mar 23 19:54:59 2017 -0700 +++ b/tests/test-check-pyflakes.t Fri Mar 24 08:37:26 2017 -0700 @@ -7,9 +7,8 @@ (skipping binary file random-seed) $ hg locate 'set:**.py or grep("^#!.*python")' -X hgext/fsmonitor/pywatchman \ - > -X mercurial/pycompat.py \ + > -X mercurial/pycompat.py -X contrib/python-zstandard \ > 2>/dev/null \ > | xargs pyflakes 2>/dev/null | "$TESTDIR/filterpyflakes.py" - contrib/python-zstandard/tests/test_data_structures.py:107: local variable 'size' is assigned to but never used tests/filterpyflakes.py:39: undefined name 'undefinedname' diff -r ed5b25874d99 -r 
4baf79a77afa tests/test-check-pylint.t --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/tests/test-check-pylint.t Fri Mar 24 08:37:26 2017 -0700 @@ -0,0 +1,15 @@ +#require test-repo pylint hg10 + +Run pylint for known rules we care about. +----------------------------------------- + +There should be no recorded failures; fix the codebase before introducing a +new check. + +Current checks: +- W0102: no mutable default argument + + $ touch $TESTTMP/fakerc + $ pylint --rcfile=$TESTTMP/fakerc --disable=all \ + > --enable=W0102 --reports=no \ + > mercurial hgext hgext3rd diff -r ed5b25874d99 -r 4baf79a77afa tests/test-chg.t --- a/tests/test-chg.t Thu Mar 23 19:54:59 2017 -0700 +++ b/tests/test-chg.t Fri Mar 24 08:37:26 2017 -0700 @@ -32,6 +32,46 @@ $ cd .. +editor +------ + + $ cat >> pushbuffer.py < def reposetup(ui, repo): + > repo.ui.pushbuffer(subproc=True) + > EOF + + $ chg init editor + $ cd editor + +by default, system() should be redirected to the client: + + $ touch foo + $ CHGDEBUG= HGEDITOR=cat chg ci -Am channeled --edit 2>&1 \ + > | egrep "HG:|run 'cat" + chg: debug: run 'cat "*"' at '$TESTTMP/editor' (glob) + HG: Enter commit message. Lines beginning with 'HG:' are removed. + HG: Leave message empty to abort commit. + HG: -- + HG: user: test + HG: branch 'default' + HG: added foo + +but no redirection should be made if output is captured: + + $ touch bar + $ CHGDEBUG= HGEDITOR=cat chg ci -Am bufferred --edit \ + > --config extensions.pushbuffer="$TESTTMP/pushbuffer.py" 2>&1 \ + > | egrep "HG:|run 'cat" + [1] + +check that commit commands succeeded: + + $ hg log -T '{rev}:{desc}\n' + 1:bufferred + 0:channeled + + $ cd .. 
+ pager ----- diff -r ed5b25874d99 -r 4baf79a77afa tests/test-clone-uncompressed.t --- a/tests/test-clone-uncompressed.t Thu Mar 23 19:54:59 2017 -0700 +++ b/tests/test-clone-uncompressed.t Fri Mar 24 08:37:26 2017 -0700 @@ -60,12 +60,12 @@ $ cat > delayer.py < import time - > from mercurial import extensions, scmutil + > from mercurial import extensions, vfs > def __call__(orig, self, path, *args, **kwargs): > if path == 'data/f1.i': > time.sleep(2) > return orig(self, path, *args, **kwargs) - > extensions.wrapfunction(scmutil.vfs, '__call__', __call__) + > extensions.wrapfunction(vfs.vfs, '__call__', __call__) > EOF prepare repo with small and big file to cover both code paths in emitrevlogdata diff -r ed5b25874d99 -r 4baf79a77afa tests/test-clone.t --- a/tests/test-clone.t Thu Mar 23 19:54:59 2017 -0700 +++ b/tests/test-clone.t Fri Mar 24 08:37:26 2017 -0700 @@ -579,11 +579,11 @@ No remote source #if windows - $ hg clone http://127.0.0.1:3121/a b + $ hg clone http://$LOCALIP:3121/a b abort: error: * (glob) [255] #else - $ hg clone http://127.0.0.1:3121/a b + $ hg clone http://$LOCALIP:3121/a b abort: error: *refused* (glob) [255] #endif diff -r ed5b25874d99 -r 4baf79a77afa tests/test-command-template.t --- a/tests/test-command-template.t Thu Mar 23 19:54:59 2017 -0700 +++ b/tests/test-command-template.t Fri Mar 24 08:37:26 2017 -0700 @@ -3348,6 +3348,18 @@ $ hg log --color=always -l 1 --template '{label(red, "text\n")}' \x1b[0;31mtext\x1b[0m (esc) +color effects can be nested (issue5413) + + $ hg debugtemplate --color=always \ + > '{label(red, "red{label(magenta, "ma{label(cyan, "cyan")}{label(yellow, "yellow")}genta")}")}\n' + \x1b[0;31mred\x1b[0;35mma\x1b[0;36mcyan\x1b[0m\x1b[0;31m\x1b[0;35m\x1b[0;33myellow\x1b[0m\x1b[0;31m\x1b[0;35mgenta\x1b[0m (esc) + +pad() should interact well with color codes (issue5416) + + $ hg debugtemplate --color=always \ + > '{pad(label(red, "red"), 5, label(cyan, "-"))}\n' + \x1b[0;31mred\x1b[0m\x1b[0;36m-\x1b[0m\x1b[0;36m-\x1b[0m 
(esc) + label should be no-op if color is disabled: $ hg log --color=never -l 1 --template '{label(red, "text\n")}' @@ -3515,6 +3527,15 @@ hg: parse error: pad() expects an integer width [255] +Test invalid fillchar passed to pad function + + $ hg log -r 0 -T '{pad(rev, 10, "")}\n' + hg: parse error: pad() expects a single fill character + [255] + $ hg log -r 0 -T '{pad(rev, 10, "--")}\n' + hg: parse error: pad() expects a single fill character + [255] + Test boolean argument passed to pad function no crash @@ -4100,6 +4121,11 @@ abort: template filter 'utf8' is not compatible with keyword 'rev' [255] +pad width: + + $ HGENCODING=utf-8 hg debugtemplate "{pad('`cat utf-8`', 2, '-')}\n" + \xc3\xa9- (esc) + $ cd .. Test that template function in extension is registered as expected diff -r ed5b25874d99 -r 4baf79a77afa tests/test-commandserver.t --- a/tests/test-commandserver.t Thu Mar 23 19:54:59 2017 -0700 +++ b/tests/test-commandserver.t Fri Mar 24 08:37:26 2017 -0700 @@ -199,6 +199,8 @@ ui.usehttp2=true (?) 
ui.foo=bar ui.nontty=true + web.address=localhost + web\.ipv6=(?:True|False) (re) *** runcommand init foo *** runcommand -R foo showconfig ui defaults defaults.backout=-d "0 0" diff -r ed5b25874d99 -r 4baf79a77afa tests/test-completion.t --- a/tests/test-completion.t Thu Mar 23 19:54:59 2017 -0700 +++ b/tests/test-completion.t Fri Mar 24 08:37:26 2017 -0700 @@ -73,6 +73,7 @@ debugbuilddag debugbundle debugcheckstate + debugcolor debugcommands debugcomplete debugconfig @@ -129,6 +130,7 @@ Show the global options $ hg debugcomplete --options | sort + --color --config --cwd --debug @@ -138,6 +140,7 @@ --help --hidden --noninteractive + --pager --profile --quiet --repository @@ -157,6 +160,7 @@ --address --certificate --cmdserver + --color --config --cwd --daemon @@ -171,6 +175,7 @@ --ipv6 --name --noninteractive + --pager --pid-file --port --prefix @@ -223,7 +228,7 @@ serve: accesslog, daemon, daemon-postexec, errorlog, port, address, prefix, name, web-conf, webdir-conf, pid-file, stdio, cmdserver, templates, style, ipv6, certificate status: all, modified, added, removed, deleted, clean, unknown, ignored, no-status, copies, print0, rev, change, include, exclude, subrepos, template summary: remote - update: clean, check, date, rev, tool + update: clean, check, merge, date, rev, tool addremove: similarity, subrepos, include, exclude, dry-run archive: no-decode, prefix, rev, type, subrepos, include, exclude backout: merge, commit, no-commit, parent, rev, edit, tool, include, exclude, message, logfile, date, user @@ -240,6 +245,7 @@ debugbuilddag: mergeable-file, overwritten-file, new-file debugbundle: all, spec debugcheckstate: + debugcolor: style debugcommands: debugcomplete: options debugcreatestreamclonebundle: @@ -352,3 +358,18 @@ fee fie fo + +Test debuglabelcomplete, a deprecated name for debugnamecomplete that is still +used for completions in some shells. 
+ + $ hg debuglabelcomplete + Fum + default + fee + fie + fo + tip + $ hg debuglabelcomplete f + fee + fie + fo diff -r ed5b25874d99 -r 4baf79a77afa tests/test-config.t --- a/tests/test-config.t Thu Mar 23 19:54:59 2017 -0700 +++ b/tests/test-config.t Fri Mar 24 08:37:26 2017 -0700 @@ -58,12 +58,12 @@ [ { "name": "Section.KeY", - "source": "*.hgrc:16", (glob) + "source": "*.hgrc:*", (glob) "value": "Case Sensitive" }, { "name": "Section.key", - "source": "*.hgrc:17", (glob) + "source": "*.hgrc:*", (glob) "value": "lower case" } ] @@ -71,7 +71,7 @@ [ { "name": "Section.KeY", - "source": "*.hgrc:16", (glob) + "source": "*.hgrc:*", (glob) "value": "Case Sensitive" } ] @@ -158,3 +158,9 @@ $ hg showconfig paths paths.foo:suboption=~/foo paths.foo=$TESTTMP/foo + +edit failure + + $ HGEDITOR=false hg config --edit + abort: edit failed: false exited with status 1 + [255] diff -r ed5b25874d99 -r 4baf79a77afa tests/test-context.py --- a/tests/test-context.py Thu Mar 23 19:54:59 2017 -0700 +++ b/tests/test-context.py Fri Mar 24 08:37:26 2017 -0700 @@ -59,7 +59,7 @@ # test performing a diff on a memctx for d in ctxb.diff(ctxa, git=True): - print(d) + print(d, end='') # test safeness and correctness of "ctx.status()" print('= checking context.status():') diff -r ed5b25874d99 -r 4baf79a77afa tests/test-context.py.out --- a/tests/test-context.py.out Thu Mar 23 19:54:59 2017 -0700 +++ b/tests/test-context.py.out Fri Mar 24 08:37:26 2017 -0700 @@ -4,13 +4,11 @@ UTF-8 : Grüezi! 
diff --git a/foo b/foo - --- a/foo +++ b/foo @@ -1,1 +1,2 @@ foo +bar - = checking context.status(): == checking workingctx.status: wctx._status= diff -r ed5b25874d99 -r 4baf79a77afa tests/test-contrib-perf.t --- a/tests/test-contrib-perf.t Thu Mar 23 19:54:59 2017 -0700 +++ b/tests/test-contrib-perf.t Fri Mar 24 08:37:26 2017 -0700 @@ -109,6 +109,7 @@ perfvolatilesets benchmark the computation of various volatile set perfwalk (no help text available) + perfwrite microbenchmark ui.write (use 'hg help -v perfstatusext' to show built-in aliases and global options) $ hg perfaddremove diff -r ed5b25874d99 -r 4baf79a77afa tests/test-convert-git.t --- a/tests/test-convert-git.t Thu Mar 23 19:54:59 2017 -0700 +++ b/tests/test-convert-git.t Fri Mar 24 08:37:26 2017 -0700 @@ -330,7 +330,7 @@ input validation $ hg convert --config convert.git.similarity=foo --datesort git-repo2 fullrepo - abort: convert.git.similarity is not an integer ('foo') + abort: convert.git.similarity is not a valid integer ('foo') [255] $ hg convert --config convert.git.similarity=-1 --datesort git-repo2 fullrepo abort: similarity must be between 0 and 100 diff -r ed5b25874d99 -r 4baf79a77afa tests/test-convert-p4.t --- a/tests/test-convert-p4.t Thu Mar 23 19:54:59 2017 -0700 +++ b/tests/test-convert-p4.t Fri Mar 24 08:37:26 2017 -0700 @@ -141,5 +141,23 @@ rev=1 desc="change a" tags="" files="a" rev=0 desc="initial" tags="" files="a b/c" +empty commit message + $ p4 edit a + //depot/test-mercurial-import/a#3 - opened for edit + $ echo aaaaa >> a + $ p4 submit -d "" + Submitting change 6. + Locking 1 files ... + edit //depot/test-mercurial-import/a#4 + Change 6 submitted. + $ hg convert -s p4 $DEPOTPATH dst + scanning source... + reading p4 views + collecting p4 changelists + 6 **empty changelist description** + sorting... + converting... 
+ 0 + exit trap: stopping the p4 server diff -r ed5b25874d99 -r 4baf79a77afa tests/test-debugcommands.t --- a/tests/test-debugcommands.t Thu Mar 23 19:54:59 2017 -0700 +++ b/tests/test-debugcommands.t Fri Mar 24 08:37:26 2017 -0700 @@ -116,18 +116,23 @@ $ cat > debugstacktrace.py << EOF > from mercurial.util import debugstacktrace, dst, sys > def f(): - > dst('hello world') + > debugstacktrace(f=sys.stdout) + > g() > def g(): - > f() - > debugstacktrace(skip=-5, f=sys.stdout) - > g() + > dst('hello from g\\n', skip=1) + > h() + > def h(): + > dst('hi ...\\nfrom h hidden in g', 1, depth=2) + > f() > EOF $ python debugstacktrace.py - hello world at: - debugstacktrace.py:7 in * (glob) - debugstacktrace.py:5 in g - debugstacktrace.py:3 in f stacktrace at: - debugstacktrace.py:7 *in * (glob) - debugstacktrace.py:6 *in g (glob) - */util.py:* in debugstacktrace (glob) + debugstacktrace.py:10 in * (glob) + debugstacktrace.py:3 in f + hello from g at: + debugstacktrace.py:10 in * (glob) + debugstacktrace.py:4 in f + hi ... + from h hidden in g at: + debugstacktrace.py:4 in f + debugstacktrace.py:7 in g diff -r ed5b25874d99 -r 4baf79a77afa tests/test-devel-warnings.t --- a/tests/test-devel-warnings.t Thu Mar 23 19:54:59 2017 -0700 +++ b/tests/test-devel-warnings.t Fri Mar 24 08:37:26 2017 -0700 @@ -139,7 +139,7 @@ $ hg blackbox -l 9 1970/01/01 00:00:00 bob @cb9a9f314b8b07ba71012fcdbc544b5a4d82ff5b (5000)> devel-warn: revset "oldstyle" uses list instead of smartset (compatibility will be dropped after Mercurial-3.9, update your code.) 
at: *mercurial/revset.py:* (mfunc) (glob) - 1970/01/01 00:00:00 bob @cb9a9f314b8b07ba71012fcdbc544b5a4d82ff5b (5000)> log -r oldstyle() -T {rev}\n exited 0 after * seconds (glob) + 1970/01/01 00:00:00 bob @cb9a9f314b8b07ba71012fcdbc544b5a4d82ff5b (5000)> log -r 'oldstyle()' -T '{rev}\n' exited 0 after * seconds (glob) 1970/01/01 00:00:00 bob @cb9a9f314b8b07ba71012fcdbc544b5a4d82ff5b (5000)> oldanddeprecated 1970/01/01 00:00:00 bob @cb9a9f314b8b07ba71012fcdbc544b5a4d82ff5b (5000)> devel-warn: foorbar is deprecated, go shopping (compatibility will be dropped after Mercurial-42.1337, update your code.) at: $TESTTMP/buggylocking.py:* (oldanddeprecated) (glob) diff -r ed5b25874d99 -r 4baf79a77afa tests/test-diff-color.t --- a/tests/test-diff-color.t Thu Mar 23 19:54:59 2017 -0700 +++ b/tests/test-diff-color.t Fri Mar 24 08:37:26 2017 -0700 @@ -1,10 +1,10 @@ Setup $ cat <> $HGRCPATH + > [ui] + > color = always > [color] > mode = ansi - > [extensions] - > color = > EOF $ hg init repo $ cd repo @@ -35,7 +35,7 @@ default context - $ hg diff --nodates --color=always + $ hg diff --nodates \x1b[0;1mdiff -r cf9f4ba66af2 a\x1b[0m (esc) \x1b[0;31;1m--- a/a\x1b[0m (esc) \x1b[0;32;1m+++ b/a\x1b[0m (esc) @@ -51,7 +51,7 @@ --unified=2 - $ hg diff --nodates -U 2 --color=always + $ hg diff --nodates -U 2 \x1b[0;1mdiff -r cf9f4ba66af2 a\x1b[0m (esc) \x1b[0;31;1m--- a/a\x1b[0m (esc) \x1b[0;32;1m+++ b/a\x1b[0m (esc) @@ -65,10 +65,11 @@ diffstat - $ hg diff --stat --color=always + $ hg diff --stat a | 2 \x1b[0;32m+\x1b[0m\x1b[0;31m-\x1b[0m (esc) 1 files changed, 1 insertions(+), 1 deletions(-) $ cat <> $HGRCPATH + > [extensions] > record = > [ui] > interactive = true @@ -81,7 +82,7 @@ record $ chmod +x a - $ hg record --color=always -m moda a < y > y > EOF @@ -111,7 +112,7 @@ qrecord - $ hg qrecord --color=always -m moda patch < y > y > EOF @@ -151,7 +152,7 @@ $ echo aa >> a $ echo bb >> sub/b - $ hg diff --color=always -S + $ hg diff -S \x1b[0;1mdiff --git a/a b/a\x1b[0m (esc) 
\x1b[0;31;1m--- a/a\x1b[0m (esc) \x1b[0;32;1m+++ b/a\x1b[0m (esc) @@ -176,7 +177,7 @@ > mid tab > all tabs > EOF - $ hg diff --nodates --color=always + $ hg diff --nodates \x1b[0;1mdiff --git a/a b/a\x1b[0m (esc) \x1b[0;31;1m--- a/a\x1b[0m (esc) \x1b[0;32;1m+++ b/a\x1b[0m (esc) @@ -192,7 +193,7 @@ \x1b[0;32m+\x1b[0m \x1b[0;32mall\x1b[0m \x1b[0;32mtabs\x1b[0m\x1b[0;1;41m \x1b[0m (esc) $ echo "[color]" >> $HGRCPATH $ echo "diff.tab = bold magenta" >> $HGRCPATH - $ hg diff --nodates --color=always + $ hg diff --nodates \x1b[0;1mdiff --git a/a b/a\x1b[0m (esc) \x1b[0;31;1m--- a/a\x1b[0m (esc) \x1b[0;32;1m+++ b/a\x1b[0m (esc) diff -r ed5b25874d99 -r 4baf79a77afa tests/test-doctest.py --- a/tests/test-doctest.py Thu Mar 23 19:54:59 2017 -0700 +++ b/tests/test-doctest.py Fri Mar 24 08:37:26 2017 -0700 @@ -5,10 +5,16 @@ import doctest import os import sys + +ispy3 = (sys.version_info[0] >= 3) + if 'TERM' in os.environ: del os.environ['TERM'] -def testmod(name, optionflags=0, testtarget=None): +# TODO: migrate doctests to py3 and enable them on both versions +def testmod(name, optionflags=0, testtarget=None, py2=True, py3=False): + if not (not ispy3 and py2 or ispy3 and py3): + return __import__(name) mod = sys.modules[name] if testtarget is not None: @@ -17,6 +23,8 @@ testmod('mercurial.changegroup') testmod('mercurial.changelog') +testmod('mercurial.color') +testmod('mercurial.config') testmod('mercurial.dagparser', optionflags=doctest.NORMALIZE_WHITESPACE) testmod('mercurial.dispatch') testmod('mercurial.encoding') @@ -28,7 +36,9 @@ testmod('mercurial.patch') testmod('mercurial.pathutil') testmod('mercurial.parser') -testmod('mercurial.revset') +testmod('mercurial.pycompat', py3=True) +testmod('mercurial.revsetlang') +testmod('mercurial.smartset') testmod('mercurial.store') testmod('mercurial.subrepo') testmod('mercurial.templatefilters') diff -r ed5b25874d99 -r 4baf79a77afa tests/test-eol.t --- a/tests/test-eol.t Thu Mar 23 19:54:59 2017 -0700 +++ b/tests/test-eol.t Fri 
Mar 24 08:37:26 2017 -0700 @@ -470,6 +470,22 @@ > EOF $ hg commit -m 'consistent' + $ hg init subrepo + $ hg -R subrepo pull -qu . + $ echo "subrepo = subrepo" > .hgsub + $ hg ci -Am "add subrepo" + adding .hgeol + adding .hgsub + $ hg archive -S ../archive + $ find ../archive/* | sort + ../archive/a.txt + ../archive/subrepo + ../archive/subrepo/a.txt + $ cat ../archive/a.txt ../archive/subrepo/a.txt + first\r (esc) + second\r (esc) + first\r (esc) + second\r (esc) Test trailing newline diff -r ed5b25874d99 -r 4baf79a77afa tests/test-extension.t --- a/tests/test-extension.t Thu Mar 23 19:54:59 2017 -0700 +++ b/tests/test-extension.t Fri Mar 24 08:37:26 2017 -0700 @@ -532,6 +532,8 @@ all prompts -q --quiet suppress output -v --verbose enable additional output + --color TYPE when to colorize (boolean, always, auto, never, or + debug) --config CONFIG [+] set/override config option (use 'section.name=value') --debug enable debugging output --debugger start debugger @@ -543,6 +545,8 @@ --version output version information and exit -h --help display help and exit --hidden consider hidden changesets + --pager TYPE when to paginate (boolean, always, auto, or never) + (default: auto) @@ -567,6 +571,8 @@ all prompts -q --quiet suppress output -v --verbose enable additional output + --color TYPE when to colorize (boolean, always, auto, never, or + debug) --config CONFIG [+] set/override config option (use 'section.name=value') --debug enable debugging output --debugger start debugger @@ -578,6 +584,8 @@ --version output version information and exit -h --help display help and exit --hidden consider hidden changesets + --pager TYPE when to paginate (boolean, always, auto, or never) + (default: auto) @@ -845,6 +853,8 @@ all prompts -q --quiet suppress output -v --verbose enable additional output + --color TYPE when to colorize (boolean, always, auto, never, or + debug) --config CONFIG [+] set/override config option (use 'section.name=value') --debug enable debugging output 
--debugger start debugger @@ -856,6 +866,8 @@ --version output version information and exit -h --help display help and exit --hidden consider hidden changesets + --pager TYPE when to paginate (boolean, always, auto, or never) + (default: auto) Make sure that single '-v' option shows help and built-ins only for 'dodo' command $ hg help -v dodo @@ -878,6 +890,8 @@ all prompts -q --quiet suppress output -v --verbose enable additional output + --color TYPE when to colorize (boolean, always, auto, never, or + debug) --config CONFIG [+] set/override config option (use 'section.name=value') --debug enable debugging output --debugger start debugger @@ -889,6 +903,8 @@ --version output version information and exit -h --help display help and exit --hidden consider hidden changesets + --pager TYPE when to paginate (boolean, always, auto, or never) + (default: auto) In case when extension name doesn't match any of its commands, help message should ask for '-v' to get list of built-in aliases @@ -949,6 +965,8 @@ all prompts -q --quiet suppress output -v --verbose enable additional output + --color TYPE when to colorize (boolean, always, auto, never, or + debug) --config CONFIG [+] set/override config option (use 'section.name=value') --debug enable debugging output --debugger start debugger @@ -960,6 +978,8 @@ --version output version information and exit -h --help display help and exit --hidden consider hidden changesets + --pager TYPE when to paginate (boolean, always, auto, or never) + (default: auto) $ hg help -v -e dudu dudu extension - @@ -981,6 +1001,8 @@ all prompts -q --quiet suppress output -v --verbose enable additional output + --color TYPE when to colorize (boolean, always, auto, never, or + debug) --config CONFIG [+] set/override config option (use 'section.name=value') --debug enable debugging output --debugger start debugger @@ -992,6 +1014,8 @@ --version output version information and exit -h --help display help and exit --hidden consider hidden changesets + 
--pager TYPE when to paginate (boolean, always, auto, or never) + (default: auto) Disabled extension commands: @@ -1089,6 +1113,14 @@ intro=never # never include an introduction message intro=always # always include an introduction message + You can specify a template for flags to be added in subject prefixes. Flags + specified by --flag option are exported as "{flags}" keyword: + + [patchbomb] + flagtemplate = "{separate(' ', + ifeq(branch, 'default', '', branch|upper), + flags)}" + You can set patchbomb to always ask for confirmation by setting "patchbomb.confirm" to true. diff -r ed5b25874d99 -r 4baf79a77afa tests/test-filecache.py --- a/tests/test-filecache.py Thu Mar 23 19:54:59 2017 -0700 +++ b/tests/test-filecache.py Fri Mar 24 08:37:26 2017 -0700 @@ -10,24 +10,30 @@ from mercurial import ( extensions, hg, - scmutil, + localrepo, ui as uimod, util, + vfs as vfsmod, ) -filecache = scmutil.filecache - class fakerepo(object): def __init__(self): self._filecache = {} - def join(self, p): - return p + class fakevfs(object): + + def join(self, p): + return p + + vfs = fakevfs() + + def unfiltered(self): + return self def sjoin(self, p): return p - @filecache('x', 'y') + @localrepo.repofilecache('x', 'y') def cached(self): print('creating') return 'string from function' @@ -73,7 +79,7 @@ # atomic replace file, size doesn't change # hopefully st_mtime doesn't change as well so this doesn't use the cache # because of inode change - f = scmutil.opener('.')('x', 'w', atomictemp=True) + f = vfsmod.vfs('.')('x', 'w', atomictemp=True) f.write('b') f.close() @@ -97,7 +103,7 @@ # should recreate the object repo.cached - f = scmutil.opener('.')('y', 'w', atomictemp=True) + f = vfsmod.vfs('.')('y', 'w', atomictemp=True) f.write('B') f.close() @@ -105,10 +111,10 @@ print("* file y changed inode") repo.cached - f = scmutil.opener('.')('x', 'w', atomictemp=True) + f = vfsmod.vfs('.')('x', 'w', atomictemp=True) f.write('c') f.close() - f = scmutil.opener('.')('y', 'w', 
atomictemp=True) + f = vfsmod.vfs('.')('y', 'w', atomictemp=True) f.write('C') f.close() @@ -200,12 +206,12 @@ # st_mtime is advanced multiple times as expected for i in xrange(repetition): # explicit closing - fp = scmutil.checkambigatclosing(open(filename, 'a')) + fp = vfsmod.checkambigatclosing(open(filename, 'a')) fp.write('FOO') fp.close() # implicit closing by "with" statement - with scmutil.checkambigatclosing(open(filename, 'a')) as fp: + with vfsmod.checkambigatclosing(open(filename, 'a')) as fp: fp.write('BAR') newstat = os.stat(filename) diff -r ed5b25874d99 -r 4baf79a77afa tests/test-fileset.t --- a/tests/test-fileset.t Thu Mar 23 19:54:59 2017 -0700 +++ b/tests/test-fileset.t Fri Mar 24 08:37:26 2017 -0700 @@ -88,6 +88,35 @@ $ fileset 'copied()' c1 +Test files status in different revisions + + $ hg status -m + M b2 + $ fileset -r0 'revs("wdir()", modified())' --traceback + b2 + $ hg status -a + A c1 + $ fileset -r0 'revs("wdir()", added())' + c1 + $ hg status --change 0 -a + A a1 + A a2 + A b1 + A b2 + $ hg status -mru + M b2 + R a2 + ? 
c3 + $ fileset -r0 'added() and revs("wdir()", modified() or removed() or unknown())' + b2 + a2 + $ fileset -r0 'added() or revs("wdir()", added())' + a1 + a2 + b1 + b2 + c1 + Test files properties >>> file('bin', 'wb').write('\0a') @@ -367,3 +396,226 @@ $ fileset 'existingcaller()' 2>&1 | tail -1 AssertionError: unexpected existing() invocation + +Test 'revs(...)' +================ + +small reminder of the repository state + + $ hg log -G + @ changeset: 4:160936123545 + | tag: tip + | user: test + | date: Thu Jan 01 00:00:00 1970 +0000 + | summary: subrepo + | + o changeset: 3:9d594e11b8c9 + |\ parent: 2:55b05bdebf36 + | | parent: 1:830839835f98 + | | user: test + | | date: Thu Jan 01 00:00:00 1970 +0000 + | | summary: merge + | | + | o changeset: 2:55b05bdebf36 + | | parent: 0:8a9576c51c1f + | | user: test + | | date: Thu Jan 01 00:00:00 1970 +0000 + | | summary: diverging + | | + o | changeset: 1:830839835f98 + |/ user: test + | date: Thu Jan 01 00:00:00 1970 +0000 + | summary: manychanges + | + o changeset: 0:8a9576c51c1f + user: test + date: Thu Jan 01 00:00:00 1970 +0000 + summary: addfiles + + $ hg status --change 0 + A a1 + A a2 + A b1 + A b2 + $ hg status --change 1 + M b2 + A 1k + A 2k + A b2link + A bin + A c1 + A con.xml + R a2 + $ hg status --change 2 + M b2 + $ hg status --change 3 + M b2 + A 1k + A 2k + A b2link + A bin + A c1 + A con.xml + R a2 + $ hg status --change 4 + A .hgsub + A .hgsubstate + $ hg status + A dos + A mac + A mixed + R con.xml + ! a1 + ? b2.orig + ? c3 + ? 
unknown + +Test files at -r0 should be filtered by files at wdir +----------------------------------------------------- + + $ fileset -r0 '* and revs("wdir()", *)' + a1 + b1 + b2 + +Test that "revs()" work at all +------------------------------ + + $ fileset "revs('2', modified())" + b2 + +Test that "revs()" work for file missing in the working copy/current context +---------------------------------------------------------------------------- + +(a2 not in working copy) + + $ fileset "revs('0', added())" + a1 + a2 + b1 + b2 + +(none of the file exist in "0") + + $ fileset -r 0 "revs('4', added())" + .hgsub + .hgsubstate + +Call with empty revset +-------------------------- + + $ fileset "revs('2-2', modified())" + +Call with revset matching multiple revs +--------------------------------------- + + $ fileset "revs('0+4', added())" + a1 + a2 + b1 + b2 + .hgsub + .hgsubstate + +overlapping set + + $ fileset "revs('1+2', modified())" + b2 + +test 'status(...)' +================= + +Simple case +----------- + + $ fileset "status(3, 4, added())" + .hgsub + .hgsubstate + +use rev to restrict matched file +----------------------------------------- + + $ hg status --removed --rev 0 --rev 1 + R a2 + $ fileset "status(0, 1, removed())" + a2 + $ fileset "* and status(0, 1, removed())" + $ fileset -r 4 "status(0, 1, removed())" + a2 + $ fileset -r 4 "* and status(0, 1, removed())" + $ fileset "revs('4', * and status(0, 1, removed()))" + $ fileset "revs('0', * and status(0, 1, removed()))" + a2 + +check wdir() +------------ + + $ hg status --removed --rev 4 + R con.xml + $ fileset "status(4, 'wdir()', removed())" + con.xml + + $ hg status --removed --rev 2 + R a2 + $ fileset "status('2', 'wdir()', removed())" + a2 + +test backward status +-------------------- + + $ hg status --removed --rev 0 --rev 4 + R a2 + $ hg status --added --rev 4 --rev 0 + A a2 + $ fileset "status(4, 0, added())" + a2 + +test cross branch status +------------------------ + + $ hg status --added --rev 1 
--rev 2 + A a2 + $ fileset "status(1, 2, added())" + a2 + +test with multi revs revset +--------------------------- + $ hg status --added --rev 0:1 --rev 3:4 + A .hgsub + A .hgsubstate + A 1k + A 2k + A b2link + A bin + A c1 + A con.xml + $ fileset "status('0:1', '3:4', added())" + .hgsub + .hgsubstate + 1k + 2k + b2link + bin + c1 + con.xml + +tests with empty value +---------------------- + +Fully empty revset + + $ fileset "status('', '4', added())" + hg: parse error: first argument to status must be a revision + [255] + $ fileset "status('2', '', added())" + hg: parse error: second argument to status must be a revision + [255] + +Empty revset will error at the revset layer + + $ fileset "status(' ', '4', added())" + hg: parse error at 1: not a prefix: end + [255] + $ fileset "status('2', ' ', added())" + hg: parse error at 1: not a prefix: end + [255] diff -r ed5b25874d99 -r 4baf79a77afa tests/test-gendoc-ro.t --- a/tests/test-gendoc-ro.t Thu Mar 23 19:54:59 2017 -0700 +++ b/tests/test-gendoc-ro.t Fri Mar 24 08:37:26 2017 -0700 @@ -1,4 +1,9 @@ #require docutils gettext +Error: the current ro localization has some rst defects exposed by +moving pager to core. These two warnings about references are expected +until the localization is corrected. $ $TESTDIR/check-gendoc ro checking for parse errors + gendoc.txt:58: (WARNING/2) Inline interpreted text or phrase reference start-string without end-string. + gendoc.txt:58: (WARNING/2) Inline interpreted text or phrase reference start-string without end-string. 
diff -r ed5b25874d99 -r 4baf79a77afa tests/test-globalopts.t --- a/tests/test-globalopts.t Thu Mar 23 19:54:59 2017 -0700 +++ b/tests/test-globalopts.t Fri Mar 24 08:37:26 2017 -0700 @@ -340,6 +340,7 @@ additional help topics: + color Colorizing Outputs config Configuration Files dates Date Formats diffs Diff Formats @@ -351,6 +352,7 @@ hgweb Configuring hgweb internals Technical implementation topics merge-tools Merge Tools + pager Pager Support patterns File Name Patterns phases Working with Phases revisions Specifying Revisions @@ -421,6 +423,7 @@ additional help topics: + color Colorizing Outputs config Configuration Files dates Date Formats diffs Diff Formats @@ -432,6 +435,7 @@ hgweb Configuring hgweb internals Technical implementation topics merge-tools Merge Tools + pager Pager Support patterns File Name Patterns phases Working with Phases revisions Specifying Revisions diff -r ed5b25874d99 -r 4baf79a77afa tests/test-glog.t --- a/tests/test-glog.t Thu Mar 23 19:54:59 2017 -0700 +++ b/tests/test-glog.t Fri Mar 24 08:37:26 2017 -0700 @@ -82,18 +82,18 @@ > } $ cat > printrevset.py < from mercurial import extensions, revset, commands, cmdutil + > from mercurial import extensions, revsetlang, commands, cmdutil > > def uisetup(ui): > def printrevset(orig, ui, repo, *pats, **opts): > if opts.get('print_revset'): > expr = cmdutil.getgraphlogrevs(repo, pats, opts)[1] > if expr: - > tree = revset.parse(expr) + > tree = revsetlang.parse(expr) > else: > tree = [] > ui.write('%r\n' % (opts.get('rev', []),)) - > ui.write(revset.prettyformat(tree) + '\n') + > ui.write(revsetlang.prettyformat(tree) + '\n') > return 0 > return orig(ui, repo, *pats, **opts) > entry = extensions.wrapcommand(commands.table, 'log', printrevset) @@ -3424,3 +3424,39 @@ summary: 0 + $ cd .. 
+ +Multiple roots (issue5440): + + $ hg init multiroots + $ cd multiroots + $ cat < .hg/hgrc + > [ui] + > logtemplate = '{rev} {desc}\n\n' + > EOF + + $ touch foo + $ hg ci -Aqm foo + $ hg co -q null + $ touch bar + $ hg ci -Aqm bar + + $ hg log -Gr null: + @ 1 bar + | + | o 0 foo + |/ + o -1 + + $ hg log -Gr null+0 + o 0 foo + | + o -1 + + $ hg log -Gr null+1 + @ 1 bar + | + o -1 + + + $ cd .. diff -r ed5b25874d99 -r 4baf79a77afa tests/test-graft.t --- a/tests/test-graft.t Thu Mar 23 19:54:59 2017 -0700 +++ b/tests/test-graft.t Fri Mar 24 08:37:26 2017 -0700 @@ -582,8 +582,7 @@ 21: fbb6c5cc81002f2b4b49c9d731404688bcae5ade branch=dev convert_revision=7e61b508e709a11d28194a5359bc3532d910af21 - transplant_source=z\xe8F\xe9\x11\x1f\xc8\xf5wEcBP\xc7\xb9\xac (esc) - `h\x9b (esc) + transplant_source=z\xe8F\xe9\x11\x1f\xc8\xf5wEcBP\xc7\xb9\xac\n`h\x9b $ hg -R ../converted log -r 'origin(tip)' changeset: 2:e0213322b2c1 user: test diff -r ed5b25874d99 -r 4baf79a77afa tests/test-hardlinks-whitelisted.t --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/tests/test-hardlinks-whitelisted.t Fri Mar 24 08:37:26 2017 -0700 @@ -0,0 +1,389 @@ +#require hardlink +#require hardlink-whitelisted + +This test is similar to test-hardlinks.t, but will only run on some filesystems +that we are sure to have known good hardlink supports (see issue4546 for an +example where the filesystem claims hardlink support but is actually +problematic). 
+ + $ cat > nlinks.py < import sys + > from mercurial import util + > for f in sorted(sys.stdin.readlines()): + > f = f[:-1] + > print util.nlinks(f), f + > EOF + + $ nlinksdir() + > { + > find $1 -type f | python $TESTTMP/nlinks.py + > } + +Some implementations of cp can't create hardlinks (replaces 'cp -al' on Linux): + + $ cat > linkcp.py < from mercurial import util + > import sys + > util.copyfiles(sys.argv[1], sys.argv[2], hardlink=True) + > EOF + + $ linkcp() + > { + > python $TESTTMP/linkcp.py $1 $2 + > } + +Prepare repo r1: + + $ hg init r1 + $ cd r1 + + $ echo c1 > f1 + $ hg add f1 + $ hg ci -m0 + + $ mkdir d1 + $ cd d1 + $ echo c2 > f2 + $ hg add f2 + $ hg ci -m1 + $ cd ../.. + + $ nlinksdir r1/.hg/store + 1 r1/.hg/store/00changelog.i + 1 r1/.hg/store/00manifest.i + 1 r1/.hg/store/data/d1/f2.i + 1 r1/.hg/store/data/f1.i + 1 r1/.hg/store/fncache + 1 r1/.hg/store/phaseroots + 1 r1/.hg/store/undo + 1 r1/.hg/store/undo.backup.fncache + 1 r1/.hg/store/undo.backupfiles + 1 r1/.hg/store/undo.phaseroots + + +Create hardlinked clone r2: + + $ hg clone -U --debug r1 r2 --config progress.debug=true + linking: 1 + linking: 2 + linking: 3 + linking: 4 + linking: 5 + linking: 6 + linking: 7 + linked 7 files + +Create non-hardlinked clone r3: + + $ hg clone --pull r1 r3 + requesting all changes + adding changesets + adding manifests + adding file changes + added 2 changesets with 2 changes to 2 files + updating to branch default + 2 files updated, 0 files merged, 0 files removed, 0 files unresolved + + +Repos r1 and r2 should now contain hardlinked files: + + $ nlinksdir r1/.hg/store + 2 r1/.hg/store/00changelog.i + 2 r1/.hg/store/00manifest.i + 2 r1/.hg/store/data/d1/f2.i + 2 r1/.hg/store/data/f1.i + 2 r1/.hg/store/fncache + 1 r1/.hg/store/phaseroots + 1 r1/.hg/store/undo + 1 r1/.hg/store/undo.backup.fncache + 1 r1/.hg/store/undo.backupfiles + 1 r1/.hg/store/undo.phaseroots + + $ nlinksdir r2/.hg/store + 2 r2/.hg/store/00changelog.i + 2 r2/.hg/store/00manifest.i + 2 
r2/.hg/store/data/d1/f2.i + 2 r2/.hg/store/data/f1.i + 2 r2/.hg/store/fncache + +Repo r3 should not be hardlinked: + + $ nlinksdir r3/.hg/store + 1 r3/.hg/store/00changelog.i + 1 r3/.hg/store/00manifest.i + 1 r3/.hg/store/data/d1/f2.i + 1 r3/.hg/store/data/f1.i + 1 r3/.hg/store/fncache + 1 r3/.hg/store/phaseroots + 1 r3/.hg/store/undo + 1 r3/.hg/store/undo.backupfiles + 1 r3/.hg/store/undo.phaseroots + + +Create a non-inlined filelog in r3: + + $ cd r3/d1 + >>> f = open('data1', 'wb') + >>> for x in range(10000): + ... f.write("%s\n" % str(x)) + >>> f.close() + $ for j in 0 1 2 3 4 5 6 7 8 9; do + > cat data1 >> f2 + > hg commit -m$j + > done + $ cd ../.. + + $ nlinksdir r3/.hg/store + 1 r3/.hg/store/00changelog.i + 1 r3/.hg/store/00manifest.i + 1 r3/.hg/store/data/d1/f2.d + 1 r3/.hg/store/data/d1/f2.i + 1 r3/.hg/store/data/f1.i + 1 r3/.hg/store/fncache + 1 r3/.hg/store/phaseroots + 1 r3/.hg/store/undo + 1 r3/.hg/store/undo.backup.fncache + 1 r3/.hg/store/undo.backup.phaseroots + 1 r3/.hg/store/undo.backupfiles + 1 r3/.hg/store/undo.phaseroots + +Push to repo r1 should break up most hardlinks in r2: + + $ hg -R r2 verify + checking changesets + checking manifests + crosschecking files in changesets and manifests + checking files + 2 files, 2 changesets, 2 total revisions + + $ cd r3 + $ hg push + pushing to $TESTTMP/r1 (glob) + searching for changes + adding changesets + adding manifests + adding file changes + added 10 changesets with 10 changes to 1 files + + $ cd .. 
+ + $ nlinksdir r2/.hg/store + 1 r2/.hg/store/00changelog.i + 1 r2/.hg/store/00manifest.i + 1 r2/.hg/store/data/d1/f2.i + 2 r2/.hg/store/data/f1.i + 2 r2/.hg/store/fncache + + $ hg -R r2 verify + checking changesets + checking manifests + crosschecking files in changesets and manifests + checking files + 2 files, 2 changesets, 2 total revisions + + + $ cd r1 + $ hg up + 1 files updated, 0 files merged, 0 files removed, 0 files unresolved + +Committing a change to f1 in r1 must break up hardlink f1.i in r2: + + $ echo c1c1 >> f1 + $ hg ci -m00 + $ cd .. + + $ nlinksdir r2/.hg/store + 1 r2/.hg/store/00changelog.i + 1 r2/.hg/store/00manifest.i + 1 r2/.hg/store/data/d1/f2.i + 1 r2/.hg/store/data/f1.i + 2 r2/.hg/store/fncache + + + $ cd r3 + $ hg tip --template '{rev}:{node|short}\n' + 11:a6451b6bc41f + $ echo bla > f1 + $ hg ci -m1 + $ cd .. + +Create hardlinked copy r4 of r3 (on Linux, we would call 'cp -al'): + + $ linkcp r3 r4 + +r4 has hardlinks in the working dir (not just inside .hg): + + $ nlinksdir r4 + 2 r4/.hg/00changelog.i + 2 r4/.hg/branch + 2 r4/.hg/cache/branch2-served + 2 r4/.hg/cache/checkisexec + 3 r4/.hg/cache/checklink (?) + ? 
r4/.hg/cache/checklink-target (glob) + 2 r4/.hg/cache/checknoexec + 2 r4/.hg/cache/rbc-names-v1 + 2 r4/.hg/cache/rbc-revs-v1 + 2 r4/.hg/dirstate + 2 r4/.hg/hgrc + 2 r4/.hg/last-message.txt + 2 r4/.hg/requires + 2 r4/.hg/store/00changelog.i + 2 r4/.hg/store/00manifest.i + 2 r4/.hg/store/data/d1/f2.d + 2 r4/.hg/store/data/d1/f2.i + 2 r4/.hg/store/data/f1.i + 2 r4/.hg/store/fncache + 2 r4/.hg/store/phaseroots + 2 r4/.hg/store/undo + 2 r4/.hg/store/undo.backup.fncache + 2 r4/.hg/store/undo.backup.phaseroots + 2 r4/.hg/store/undo.backupfiles + 2 r4/.hg/store/undo.phaseroots + 4 r4/.hg/undo.backup.dirstate + 2 r4/.hg/undo.bookmarks + 2 r4/.hg/undo.branch + 2 r4/.hg/undo.desc + 4 r4/.hg/undo.dirstate + 2 r4/d1/data1 + 2 r4/d1/f2 + 2 r4/f1 + +Update back to revision 11 in r4 should break hardlink of file f1: + + $ hg -R r4 up 11 + 1 files updated, 0 files merged, 0 files removed, 0 files unresolved + + $ nlinksdir r4 + 2 r4/.hg/00changelog.i + 1 r4/.hg/branch + 2 r4/.hg/cache/branch2-served + 2 r4/.hg/cache/checkisexec + 2 r4/.hg/cache/checklink-target + 2 r4/.hg/cache/checknoexec + 2 r4/.hg/cache/rbc-names-v1 + 2 r4/.hg/cache/rbc-revs-v1 + 1 r4/.hg/dirstate + 2 r4/.hg/hgrc + 2 r4/.hg/last-message.txt + 2 r4/.hg/requires + 2 r4/.hg/store/00changelog.i + 2 r4/.hg/store/00manifest.i + 2 r4/.hg/store/data/d1/f2.d + 2 r4/.hg/store/data/d1/f2.i + 2 r4/.hg/store/data/f1.i + 2 r4/.hg/store/fncache + 2 r4/.hg/store/phaseroots + 2 r4/.hg/store/undo + 2 r4/.hg/store/undo.backup.fncache + 2 r4/.hg/store/undo.backup.phaseroots + 2 r4/.hg/store/undo.backupfiles + 2 r4/.hg/store/undo.phaseroots + 4 r4/.hg/undo.backup.dirstate + 2 r4/.hg/undo.bookmarks + 2 r4/.hg/undo.branch + 2 r4/.hg/undo.desc + 4 r4/.hg/undo.dirstate + 2 r4/d1/data1 + 2 r4/d1/f2 + 1 r4/f1 + + +Test hardlinking outside hg: + + $ mkdir x + $ echo foo > x/a + + $ linkcp x y + $ echo bar >> y/a + +No diff if hardlink: + + $ diff x/a y/a + +Test mq hardlinking: + + $ echo "[extensions]" >> $HGRCPATH + $ echo "mq=" >> 
$HGRCPATH + + $ hg init a + $ cd a + + $ hg qimport -n foo - << EOF + > # HG changeset patch + > # Date 1 0 + > diff -r 2588a8b53d66 a + > --- /dev/null Thu Jan 01 00:00:00 1970 +0000 + > +++ b/a Wed Jul 23 15:54:29 2008 +0200 + > @@ -0,0 +1,1 @@ + > +a + > EOF + adding foo to series file + + $ hg qpush + applying foo + now at: foo + + $ cd .. + $ linkcp a b + $ cd b + + $ hg qimport -n bar - << EOF + > # HG changeset patch + > # Date 2 0 + > diff -r 2588a8b53d66 a + > --- /dev/null Thu Jan 01 00:00:00 1970 +0000 + > +++ b/b Wed Jul 23 15:54:29 2008 +0200 + > @@ -0,0 +1,1 @@ + > +b + > EOF + adding bar to series file + + $ hg qpush + applying bar + now at: bar + + $ cat .hg/patches/status + 430ed4828a74fa4047bc816a25500f7472ab4bfe:foo + 4e7abb4840c46a910f6d7b4d3c3fc7e5209e684c:bar + + $ cat .hg/patches/series + foo + bar + + $ cat ../a/.hg/patches/status + 430ed4828a74fa4047bc816a25500f7472ab4bfe:foo + + $ cat ../a/.hg/patches/series + foo + +Test tags hardlinking: + + $ hg qdel -r qbase:qtip + patch foo finalized without changeset message + patch bar finalized without changeset message + + $ hg tag -l lfoo + $ hg tag foo + + $ cd .. + $ linkcp b c + $ cd c + + $ hg tag -l -r 0 lbar + $ hg tag -r 0 bar + + $ cat .hgtags + 4e7abb4840c46a910f6d7b4d3c3fc7e5209e684c foo + 430ed4828a74fa4047bc816a25500f7472ab4bfe bar + + $ cat .hg/localtags + 4e7abb4840c46a910f6d7b4d3c3fc7e5209e684c lfoo + 430ed4828a74fa4047bc816a25500f7472ab4bfe lbar + + $ cat ../b/.hgtags + 4e7abb4840c46a910f6d7b4d3c3fc7e5209e684c foo + + $ cat ../b/.hg/localtags + 4e7abb4840c46a910f6d7b4d3c3fc7e5209e684c lfoo + + $ cd .. 
diff -r ed5b25874d99 -r 4baf79a77afa tests/test-hardlinks.t --- a/tests/test-hardlinks.t Thu Mar 23 19:54:59 2017 -0700 +++ b/tests/test-hardlinks.t Fri Mar 24 08:37:26 2017 -0700 @@ -166,7 +166,7 @@ 1 r2/.hg/store/00manifest.i 1 r2/.hg/store/data/d1/f2.i 2 r2/.hg/store/data/f1.i - 1 r2/.hg/store/fncache + [12] r2/\.hg/store/fncache (re) $ hg -R r2 verify checking changesets @@ -191,7 +191,7 @@ 1 r2/.hg/store/00manifest.i 1 r2/.hg/store/data/d1/f2.i 1 r2/.hg/store/data/f1.i - 1 r2/.hg/store/fncache + [12] r2/\.hg/store/fncache (re) $ cd r3 @@ -233,11 +233,11 @@ 2 r4/.hg/store/undo.backup.phaseroots 2 r4/.hg/store/undo.backupfiles 2 r4/.hg/store/undo.phaseroots - 2 r4/.hg/undo.backup.dirstate + [24] r4/\.hg/undo\.backup\.dirstate (re) 2 r4/.hg/undo.bookmarks 2 r4/.hg/undo.branch 2 r4/.hg/undo.desc - 2 r4/.hg/undo.dirstate + [24] r4/\.hg/undo\.dirstate (re) 2 r4/d1/data1 2 r4/d1/f2 2 r4/f1 @@ -272,11 +272,11 @@ 2 r4/.hg/store/undo.backup.phaseroots 2 r4/.hg/store/undo.backupfiles 2 r4/.hg/store/undo.phaseroots - 2 r4/.hg/undo.backup.dirstate + [24] r4/\.hg/undo\.backup\.dirstate (re) 2 r4/.hg/undo.bookmarks 2 r4/.hg/undo.branch 2 r4/.hg/undo.desc - 2 r4/.hg/undo.dirstate + [24] r4/\.hg/undo\.dirstate (re) 2 r4/d1/data1 2 r4/d1/f2 1 r4/f1 diff -r ed5b25874d99 -r 4baf79a77afa tests/test-help.t --- a/tests/test-help.t Thu Mar 23 19:54:59 2017 -0700 +++ b/tests/test-help.t Fri Mar 24 08:37:26 2017 -0700 @@ -102,6 +102,7 @@ additional help topics: + color Colorizing Outputs config Configuration Files dates Date Formats diffs Diff Formats @@ -113,6 +114,7 @@ hgweb Configuring hgweb internals Technical implementation topics merge-tools Merge Tools + pager Pager Support patterns File Name Patterns phases Working with Phases revisions Specifying Revisions @@ -177,6 +179,7 @@ additional help topics: + color Colorizing Outputs config Configuration Files dates Date Formats diffs Diff Formats @@ -188,6 +191,7 @@ hgweb Configuring hgweb internals Technical implementation topics 
merge-tools Merge Tools + pager Pager Support patterns File Name Patterns phases Working with Phases revisions Specifying Revisions @@ -248,7 +252,6 @@ censor erase file content at a given revision churn command to display statistics about repository history clonebundles advertise pre-generated bundles to seed clones - color colorize output from some commands convert import revisions from foreign VCS repositories into Mercurial eol automatically manage newlines in repository files @@ -262,7 +265,6 @@ largefiles track large binary files mq manage a stack of patches notify hooks for sending email push notifications - pager browse command output with an external pager patchbomb command to send changesets as (a series of) patch emails purge command to delete untracked files from the working directory @@ -315,6 +317,8 @@ all prompts -q --quiet suppress output -v --verbose enable additional output + --color TYPE when to colorize (boolean, always, auto, never, or + debug) --config CONFIG [+] set/override config option (use 'section.name=value') --debug enable debugging output --debugger start debugger @@ -326,6 +330,8 @@ --version output version information and exit -h --help display help and exit --hidden consider hidden changesets + --pager TYPE when to paginate (boolean, always, auto, or never) + (default: auto) (use 'hg help' for the full list of commands) @@ -411,6 +417,8 @@ all prompts -q --quiet suppress output -v --verbose enable additional output + --color TYPE when to colorize (boolean, always, auto, never, or + debug) --config CONFIG [+] set/override config option (use 'section.name=value') --debug enable debugging output --debugger start debugger @@ -422,6 +430,8 @@ --version output version information and exit -h --help display help and exit --hidden consider hidden changesets + --pager TYPE when to paginate (boolean, always, auto, or never) + (default: auto) Test the textwidth config option @@ -678,6 +688,7 @@ > ('', 'newline', '', 'line1\nline2')], > 'hg 
nohelp', > norepo=True) + > @command('debugoptADV', [('', 'aopt', None, 'option is (ADVANCED)')]) > @command('debugoptDEP', [('', 'dopt', None, 'option is (DEPRECATED)')]) > @command('debugoptEXP', [('', 'eopt', None, 'option is (EXPERIMENTAL)')]) > def nohelp(ui, *args, **kwargs): @@ -816,6 +827,7 @@ additional help topics: + color Colorizing Outputs config Configuration Files dates Date Formats diffs Diff Formats @@ -827,6 +839,7 @@ hgweb Configuring hgweb internals Technical implementation topics merge-tools Merge Tools + pager Pager Support patterns File Name Patterns phases Working with Phases revisions Specifying Revisions @@ -853,6 +866,7 @@ debugbundle lists the contents of a bundle debugcheckstate validate the correctness of the current dirstate + debugcolor show available color, effects or style debugcommands list all available commands and options debugcomplete @@ -889,6 +903,7 @@ complete "names" - tags, open branch names, bookmark names debugobsolete create arbitrary obsolete marker + debugoptADV (no help text available) debugoptDEP (no help text available) debugoptEXP (no help text available) debugpathcomplete @@ -925,6 +940,7 @@ """"""""""""""""""""""""""""""" bundles Bundles + censor Censor changegroups Changegroups requirements Repository Requirements revlogs Revision Logs @@ -937,37 +953,51 @@ """""""""""" Changegroups are representations of repository revlog data, specifically - the changelog, manifest, and filelogs. + the changelog data, root/flat manifest data, treemanifest data, and + filelogs. There are 3 versions of changegroups: "1", "2", and "3". From a high- level, versions "1" and "2" are almost exactly the same, with the only - difference being a header on entries in the changeset segment. Version "3" - adds support for exchanging treemanifests and includes revlog flags in the - delta header. - - Changegroups consists of 3 logical segments: + difference being an additional item in the *delta header*. 
Version "3" + adds support for revlog flags in the *delta header* and optionally + exchanging treemanifests (enabled by setting an option on the + "changegroup" part in the bundle2). + + Changegroups when not exchanging treemanifests consist of 3 logical + segments: +---------------------------------+ | | | | | changeset | manifest | filelogs | | | | | + | | | | +---------------------------------+ + When exchanging treemanifests, there are 4 logical segments: + + +-------------------------------------------------+ + | | | | | + | changeset | root | treemanifests | filelogs | + | | manifest | | | + | | | | | + +-------------------------------------------------+ + The principle building block of each segment is a *chunk*. A *chunk* is a framed piece of data: +---------------------------------------+ | | | | length | data | - | (32 bits) | bytes | + | (4 bytes) | ( bytes) | | | | +---------------------------------------+ - Each chunk starts with a 32-bit big-endian signed integer indicating the - length of the raw data that follows. - - There is a special case chunk that has 0 length ("0x00000000"). We call - this an *empty chunk*. + All integers are big-endian signed integers. Each chunk starts with a + 32-bit integer indicating the length of the entire chunk (including the + length field itself). + + There is a special case chunk that has a value of 0 for the length + ("0x00000000"). We call this an *empty chunk*. 
Delta Groups ============ @@ -981,26 +1011,27 @@ +------------------------------------------------------------------------+ | | | | | | | chunk0 length | chunk0 data | chunk1 length | chunk1 data | 0x0 | - | (32 bits) | (various) | (32 bits) | (various) | (32 bits) | + | (4 bytes) | (various) | (4 bytes) | (various) | (4 bytes) | | | | | | | - +------------------------------------------------------------+-----------+ + +------------------------------------------------------------------------+ Each *chunk*'s data consists of the following: - +-----------------------------------------+ - | | | | - | delta header | mdiff header | delta | - | (various) | (12 bytes) | (various) | - | | | | - +-----------------------------------------+ - - The *length* field is the byte length of the remaining 3 logical pieces of - data. The *delta* is a diff from an existing entry in the changelog. + +---------------------------------------+ + | | | + | delta header | delta data | + | (various by version) | (various) | + | | | + +---------------------------------------+ + + The *delta data* is a series of *delta*s that describe a diff from an + existing entry (either that the recipient already has, or previously + specified in the bundlei/changegroup). The *delta header* is different between versions "1", "2", and "3" of the changegroup format. 
- Version 1: + Version 1 (headerlen=80): +------------------------------------------------------+ | | | | | @@ -1009,7 +1040,7 @@ | | | | | +------------------------------------------------------+ - Version 2: + Version 2 (headerlen=100): +------------------------------------------------------------------+ | | | | | | @@ -1018,30 +1049,36 @@ | | | | | | +------------------------------------------------------------------+ - Version 3: + Version 3 (headerlen=102): +------------------------------------------------------------------------------+ | | | | | | | - | node | p1 node | p2 node | base node | link node | flags | + | node | p1 node | p2 node | base node | link node | flags | | (20 bytes) | (20 bytes) | (20 bytes) | (20 bytes) | (20 bytes) | (2 bytes) | | | | | | | | +------------------------------------------------------------------------------+ - The *mdiff header* consists of 3 32-bit big-endian signed integers - describing offsets at which to apply the following delta content: - - +-------------------------------------+ - | | | | - | offset | old length | new length | - | (32 bits) | (32 bits) | (32 bits) | - | | | | - +-------------------------------------+ + The *delta data* consists of "chunklen - 4 - headerlen" bytes, which + contain a series of *delta*s, densely packed (no separators). These deltas + describe a diff from an existing entry (either that the recipient already + has, or previously specified in the bundle/changegroup). The format is + described more fully in "hg help internals.bdiff", but briefly: + + +---------------------------------------------------------------+ + | | | | | + | start offset | end offset | new length | content | + | (4 bytes) | (4 bytes) | (4 bytes) | (<new length> bytes) | + | | | | | + +---------------------------------------------------------------+ + + Please note that the length field in the delta data does *not* include + itself. 
In version 1, the delta is always applied against the previous node from the changegroup or the first parent if this is the first entry in the changegroup. - In version 2, the delta base node is encoded in the entry in the + In version 2 and up, the delta base node is encoded in the entry in the changegroup. This allows the delta to be expressed against any parent, which can result in smaller deltas and more efficient encoding of data. @@ -1049,46 +1086,61 @@ ================= The *changeset segment* consists of a single *delta group* holding - changelog data. It is followed by an *empty chunk* to denote the boundary - to the *manifests segment*. + changelog data. The *empty chunk* at the end of the *delta group* denotes + the boundary to the *manifest segment*. Manifest Segment ================ The *manifest segment* consists of a single *delta group* holding manifest - data. It is followed by an *empty chunk* to denote the boundary to the - *filelogs segment*. + data. If treemanifests are in use, it contains only the manifest for the + root directory of the repository. Otherwise, it contains the entire + manifest data. The *empty chunk* at the end of the *delta group* denotes + the boundary to the next segment (either the *treemanifests segment* or + the *filelogs segment*, depending on version and the request options). + + Treemanifests Segment + --------------------- + + The *treemanifests segment* only exists in changegroup version "3", and + only if the 'treemanifest' param is part of the bundle2 changegroup part + (it is not possible to use changegroup version 3 outside of bundle2). + Aside from the filenames in the *treemanifests segment* containing a + trailing "/" character, it behaves identically to the *filelogs segment* + (see below). The final sub-segment is followed by an *empty chunk* + (logically, a sub-segment with filename size 0). This denotes the boundary + to the *filelogs segment*. 
Filelogs Segment ================ - The *filelogs* segment consists of multiple sub-segments, each + The *filelogs segment* consists of multiple sub-segments, each corresponding to an individual file whose data is being described: - +--------------------------------------+ - | | | | | - | filelog0 | filelog1 | filelog2 | ... | - | | | | | - +--------------------------------------+ - - In version "3" of the changegroup format, filelogs may include directory - logs when treemanifests are in use. directory logs are identified by - having a trailing '/' on their filename (see below). - - The final filelog sub-segment is followed by an *empty chunk* to denote - the end of the segment and the overall changegroup. + +--------------------------------------------------+ + | | | | | | + | filelog0 | filelog1 | filelog2 | ... | 0x0 | + | | | | | (4 bytes) | + | | | | | | + +--------------------------------------------------+ + + The final filelog sub-segment is followed by an *empty chunk* (logically, + a sub-segment with filename size 0). This denotes the end of the segment + and of the overall changegroup. Each filelog sub-segment consists of the following: - +------------------------------------------+ - | | | | - | filename size | filename | delta group | - | (32 bits) | (various) | (various) | - | | | | - +------------------------------------------+ + +------------------------------------------------------+ + | | | | + | filename length | filename | delta group | + | (4 bytes) | (<length - 4> bytes) | (various) | + | | | | + +------------------------------------------------------+ That is, a *chunk* consisting of the filename (not terminated or padded) - followed by N chunks constituting the *delta group* for this file. + followed by N chunks constituting the *delta group* for this file. The + *empty chunk* at the end of each *delta group* denotes the boundary to the + next filelog sub-segment. 
Test list of commands with command with no help text @@ -1102,7 +1154,15 @@ (use 'hg help -v helpext' to show built-in aliases and global options) -test deprecated and experimental options are hidden in command help +test advanced, deprecated and experimental options are hidden in command help + $ hg help debugoptADV + hg debugoptADV + + (no help text available) + + options: + + (some details hidden, use --verbose to show complete help) $ hg help debugoptDEP hg debugoptDEP @@ -1121,7 +1181,9 @@ (some details hidden, use --verbose to show complete help) -test deprecated and experimental options is shown with -v +test advanced, deprecated and experimental options are shown with -v + $ hg help -v debugoptADV | grep aopt + --aopt option is (ADVANCED) $ hg help -v debugoptDEP | grep dopt --dopt option is (DEPRECATED) $ hg help -v debugoptEXP | grep eopt @@ -1547,11 +1609,11 @@ "default:pushurl" should be used instead. $ hg help glossary.mcguffin - abort: help section not found + abort: help section not found: glossary.mcguffin [255] $ hg help glossary.mc.guffin - abort: help section not found + abort: help section not found: glossary.mc.guffin [255] $ hg help template.files @@ -1792,7 +1854,7 @@ $ hg serve -R "$TESTTMP/test" -n test -p $HGPORT -d --pid-file=hg.pid $ cat hg.pid >> $DAEMON_PIDS - $ get-with-headers.py 127.0.0.1:$HGPORT "help" + $ get-with-headers.py $LOCALIP:$HGPORT "help" 200 Script output follows @@ -1837,6 +1899,13 @@

Topics

+ + color + + + Colorizing Outputs + + config @@ -1914,6 +1983,13 @@ Merge Tools + + pager + + + Pager Support + + patterns @@ -2361,7 +2437,7 @@ - $ get-with-headers.py 127.0.0.1:$HGPORT "help/add" + $ get-with-headers.py $LOCALIP:$HGPORT "help/add" 200 Script output follows @@ -2491,6 +2567,9 @@ --verbose enable additional output + --color TYPE + when to colorize (boolean, always, auto, never, or debug) + --config CONFIG [+] set/override config option (use 'section.name=value') @@ -2523,6 +2602,9 @@ --hidden consider hidden changesets + + --pager TYPE + when to paginate (boolean, always, auto, or never) (default: auto) @@ -2535,7 +2617,7 @@ - $ get-with-headers.py 127.0.0.1:$HGPORT "help/remove" + $ get-with-headers.py $LOCALIP:$HGPORT "help/remove" 200 Script output follows @@ -2686,6 +2768,9 @@ --verbose enable additional output + --color TYPE + when to colorize (boolean, always, auto, never, or debug) + --config CONFIG [+] set/override config option (use 'section.name=value') @@ -2718,6 +2803,9 @@ --hidden consider hidden changesets + + --pager TYPE + when to paginate (boolean, always, auto, or never) (default: auto) @@ -2730,7 +2818,7 @@ - $ get-with-headers.py 127.0.0.1:$HGPORT "help/dates" + $ get-with-headers.py $LOCALIP:$HGPORT "help/dates" 200 Script output follows @@ -2837,7 +2925,7 @@ Sub-topic indexes rendered properly - $ get-with-headers.py 127.0.0.1:$HGPORT "help/internals" + $ get-with-headers.py $LOCALIP:$HGPORT "help/internals" 200 Script output follows @@ -2889,6 +2977,13 @@ Bundles + + censor + + + Censor + + changegroups @@ -2933,7 +3028,7 @@ Sub-topic topics rendered properly - $ get-with-headers.py 127.0.0.1:$HGPORT "help/internals.changegroups" + $ get-with-headers.py $LOCALIP:$HGPORT "help/internals.changegroups" 200 Script output follows @@ -2980,26 +3075,41 @@

Changegroups

Changegroups are representations of repository revlog data, specifically - the changelog, manifest, and filelogs. + the changelog data, root/flat manifest data, treemanifest data, and + filelogs.

There are 3 versions of changegroups: "1", "2", and "3". From a - high-level, versions "1" and "2" are almost exactly the same, with - the only difference being a header on entries in the changeset - segment. Version "3" adds support for exchanging treemanifests and - includes revlog flags in the delta header. + high-level, versions "1" and "2" are almost exactly the same, with the + only difference being an additional item in the *delta header*. Version + "3" adds support for revlog flags in the *delta header* and optionally + exchanging treemanifests (enabled by setting an option on the + "changegroup" part in the bundle2).

- Changegroups consists of 3 logical segments: + Changegroups when not exchanging treemanifests consist of 3 logical + segments:

   +---------------------------------+
   |           |          |          |
   | changeset | manifest | filelogs |
   |           |          |          |
+  |           |          |          |
   +---------------------------------+
   

+ When exchanging treemanifests, there are 4 logical segments: +

+
+  +-------------------------------------------------+
+  |           |          |               |          |
+  | changeset |   root   | treemanifests | filelogs |
+  |           | manifest |               |          |
+  |           |          |               |          |
+  +-------------------------------------------------+
+  
+

The principal building block of each segment is a *chunk*. A *chunk* is a framed piece of data:

@@ -3007,17 +3117,18 @@ +---------------------------------------+ | | | | length | data | - | (32 bits) | <length> bytes | + | (4 bytes) | (<length - 4> bytes) | | | | +---------------------------------------+

- Each chunk starts with a 32-bit big-endian signed integer indicating - the length of the raw data that follows. + All integers are big-endian signed integers. Each chunk starts with a 32-bit + integer indicating the length of the entire chunk (including the length field + itself).

- There is a special case chunk that has 0 length ("0x00000000"). We - call this an *empty chunk*. + There is a special case chunk that has a value of 0 for the length + ("0x00000000"). We call this an *empty chunk*.
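The chunk framing described above can be sketched as follows. This is a minimal illustration, not Mercurial's own reader: the names `read_chunk`, `write_chunk` and `EMPTY_CHUNK` are hypothetical, and treating any non-zero length of 4 or less as an error is an assumption of this sketch.

```python
import io
import struct

# An empty chunk is just a length field with the value 0.
EMPTY_CHUNK = struct.pack(">i", 0)


def write_chunk(data):
    """Frame `data` as a chunk; the length counts its own 4 bytes."""
    return struct.pack(">i", len(data) + 4) + data


def read_chunk(stream):
    """Return the payload of the next chunk, or None for an empty chunk."""
    raw = stream.read(4)
    if len(raw) < 4:
        raise EOFError("stream ended inside a chunk length field")
    (length,) = struct.unpack(">i", raw)  # big-endian signed 32-bit
    if length == 0:
        return None
    if length <= 4:
        # Assumption: a non-zero length too small to hold data is invalid.
        raise ValueError("invalid chunk length: %d" % length)
    data = stream.read(length - 4)
    if len(data) != length - 4:
        raise EOFError("stream ended inside chunk data")
    return data
```

Returning `None` for the empty chunk keeps it distinguishable from any data-carrying chunk, which matters because the empty chunk is what marks segment boundaries.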

Delta Groups

@@ -3032,31 +3143,32 @@ +------------------------------------------------------------------------+ | | | | | | | chunk0 length | chunk0 data | chunk1 length | chunk1 data | 0x0 | - | (32 bits) | (various) | (32 bits) | (various) | (32 bits) | + | (4 bytes) | (various) | (4 bytes) | (various) | (4 bytes) | | | | | | | - +------------------------------------------------------------+-----------+ + +------------------------------------------------------------------------+

Each *chunk*'s data consists of the following:

-  +-----------------------------------------+
-  |              |              |           |
-  | delta header | mdiff header |   delta   |
-  |  (various)   |  (12 bytes)  | (various) |
-  |              |              |           |
-  +-----------------------------------------+
+  +---------------------------------------+
+  |                        |              |
+  |     delta header       |  delta data  |
+  |  (various by version)  |  (various)   |
+  |                        |              |
+  +---------------------------------------+
   

- The *length* field is the byte length of the remaining 3 logical pieces - of data. The *delta* is a diff from an existing entry in the changelog. + The *delta data* is a series of *delta*s that describe a diff from an existing + entry (either that the recipient already has, or previously specified in the + bundle/changegroup).

The *delta header* is different between versions "1", "2", and "3" of the changegroup format.

- Version 1: + Version 1 (headerlen=80):

   +------------------------------------------------------+
@@ -3067,7 +3179,7 @@
   +------------------------------------------------------+
   

- Version 2: + Version 2 (headerlen=100):

   +------------------------------------------------------------------+
@@ -3078,85 +3190,104 @@
   +------------------------------------------------------------------+
   

- Version 3: + Version 3 (headerlen=102):

   +------------------------------------------------------------------------------+
   |            |             |             |            |            |           |
-  |    node    |   p1 node   |   p2 node   | base node  | link node  | flags     |
+  |    node    |   p1 node   |   p2 node   | base node  | link node  |   flags   |
   | (20 bytes) |  (20 bytes) |  (20 bytes) | (20 bytes) | (20 bytes) | (2 bytes) |
   |            |             |             |            |            |           |
   +------------------------------------------------------------------------------+
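As an illustration of the version 3 layout, the 102-byte header can be split off a chunk's data with `struct`. This is only a sketch: `V3_HEADER`, `DeltaHeaderV3` and `parse_v3_header` are hypothetical names, and reading the flags as an unsigned 16-bit value (`H`) is an assumption made here because flags are a bit field.

```python
import struct
from collections import namedtuple

# Five 20-byte nodes followed by 2 bytes of flags, big-endian, no padding.
V3_HEADER = struct.Struct(">20s20s20s20s20sH")

DeltaHeaderV3 = namedtuple("DeltaHeaderV3", "node p1 p2 base link flags")


def parse_v3_header(chunk_data):
    """Split chunk data into (delta header, remaining delta data)."""
    if len(chunk_data) < V3_HEADER.size:
        raise ValueError("chunk too short for a version 3 delta header")
    header = DeltaHeaderV3(*V3_HEADER.unpack_from(chunk_data))
    return header, chunk_data[V3_HEADER.size:]
```

`V3_HEADER.size` evaluates to 102, matching the `headerlen` given for version 3.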
   

- The *mdiff header* consists of 3 32-bit big-endian signed integers - describing offsets at which to apply the following delta content: + The *delta data* consists of "chunklen - 4 - headerlen" bytes, which contain a + series of *delta*s, densely packed (no separators). These deltas describe a diff + from an existing entry (either that the recipient already has, or previously + specified in the bundle/changegroup). The format is described more fully in + "hg help internals.bdiff", but briefly:

-  +-------------------------------------+
-  |           |            |            |
-  |  offset   | old length | new length |
-  | (32 bits) |  (32 bits) |  (32 bits) |
-  |           |            |            |
-  +-------------------------------------+
+  +---------------------------------------------------------------+
+  |              |            |            |                      |
+  | start offset | end offset | new length |        content       |
+  |  (4 bytes)   |  (4 bytes) |  (4 bytes) | (<new length> bytes) |
+  |              |            |            |                      |
+  +---------------------------------------------------------------+
   

+ Please note that the length field in the delta data does *not* include itself. +
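A rough sketch of how such densely packed deltas could be applied to a base text is shown below. The function name `apply_delta_data` is hypothetical. The sketch assumes that the deltas appear in increasing offset order, do not overlap, and that offsets refer to the base text; "hg help internals.bdiff" is the reference for the actual format.

```python
import struct


def apply_delta_data(base, delta_data):
    """Apply a packed series of deltas to `base` and return the new text."""
    out = []
    last = 0  # end of the base region consumed so far
    pos = 0   # read position inside delta_data
    while pos < len(delta_data):
        if len(delta_data) - pos < 12:
            raise ValueError("truncated delta header")
        start, end, new_length = struct.unpack_from(">iii", delta_data, pos)
        pos += 12
        # new_length counts only the content, not the 12 header bytes.
        content = delta_data[pos:pos + new_length]
        if new_length < 0 or len(content) != new_length:
            raise ValueError("truncated delta content")
        pos += new_length
        if not (last <= start <= end <= len(base)):
            raise ValueError("delta offsets out of order or out of range")
        out.append(base[last:start])  # unchanged bytes before this delta
        out.append(content)           # replacement for base[start:end]
        last = end
    out.append(base[last:])           # unchanged tail
    return b"".join(out)
```

An empty `delta_data` leaves the base unchanged, and a delta with `start == end` is a pure insertion.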

+

In version 1, the delta is always applied against the previous node from the changegroup or the first parent if this is the first entry in the changegroup.

- In version 2, the delta base node is encoded in the entry in the + In version 2 and up, the delta base node is encoded in the entry in the changegroup. This allows the delta to be expressed against any parent, which can result in smaller deltas and more efficient encoding of data.
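The rule for choosing the delta base can be written out as a small helper. This is a sketch with hypothetical names (`delta_base`, `previous_node`), not code from Mercurial.

```python
def delta_base(version, p1, header_base=None, previous_node=None):
    """Return the node against which a chunk's delta is applied.

    version:       changegroup version as a string: "1", "2" or "3"
    p1:            first parent taken from the delta header
    header_base:   base node from the delta header (versions 2 and up)
    previous_node: node of the previous entry in this delta group,
                   or None for the first entry
    """
    if version == "1":
        # Version 1 has no base node field: chain on the previous entry,
        # falling back to the first parent for the first entry.
        return previous_node if previous_node is not None else p1
    return header_base
```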

Changeset Segment

The *changeset segment* consists of a single *delta group* holding - changelog data. It is followed by an *empty chunk* to denote the - boundary to the *manifests segment*. + changelog data. The *empty chunk* at the end of the *delta group* denotes + the boundary to the *manifest segment*.

Manifest Segment

- The *manifest segment* consists of a single *delta group* holding - manifest data. It is followed by an *empty chunk* to denote the boundary - to the *filelogs segment*. + The *manifest segment* consists of a single *delta group* holding manifest + data. If treemanifests are in use, it contains only the manifest for the + root directory of the repository. Otherwise, it contains the entire + manifest data. The *empty chunk* at the end of the *delta group* denotes + the boundary to the next segment (either the *treemanifests segment* or the + *filelogs segment*, depending on version and the request options). +

+

Treemanifests Segment

+

+ The *treemanifests segment* only exists in changegroup version "3", and + only if the 'treemanifest' param is part of the bundle2 changegroup part + (it is not possible to use changegroup version 3 outside of bundle2). + Aside from the filenames in the *treemanifests segment* containing a + trailing "/" character, it behaves identically to the *filelogs segment* + (see below). The final sub-segment is followed by an *empty chunk* (logically, + a sub-segment with filename size 0). This denotes the boundary to the + *filelogs segment*.

Filelogs Segment

- The *filelogs* segment consists of multiple sub-segments, each + The *filelogs segment* consists of multiple sub-segments, each corresponding to an individual file whose data is being described:

-  +--------------------------------------+
-  |          |          |          |     |
-  | filelog0 | filelog1 | filelog2 | ... |
-  |          |          |          |     |
-  +--------------------------------------+
+  +--------------------------------------------------+
+  |          |          |          |     |           |
+  | filelog0 | filelog1 | filelog2 | ... |    0x0    |
+  |          |          |          |     | (4 bytes) |
+  |          |          |          |     |           |
+  +--------------------------------------------------+
   

- In version "3" of the changegroup format, filelogs may include - directory logs when treemanifests are in use. directory logs are - identified by having a trailing '/' on their filename (see below). -

-

- The final filelog sub-segment is followed by an *empty chunk* to denote - the end of the segment and the overall changegroup. + The final filelog sub-segment is followed by an *empty chunk* (logically, + a sub-segment with filename size 0). This denotes the end of the segment + and of the overall changegroup.

Each filelog sub-segment consists of the following:

-  +------------------------------------------+
-  |               |            |             |
-  | filename size |  filename  | delta group |
-  |   (32 bits)   |  (various) |  (various)  |
-  |               |            |             |
-  +------------------------------------------+
+  +------------------------------------------------------+
+  |                 |                      |             |
+  | filename length |       filename       | delta group |
+  |    (4 bytes)    | (<length - 4> bytes) |  (various)  |
+  |                 |                      |             |
+  +------------------------------------------------------+
   

That is, a *chunk* consisting of the filename (not terminated or padded) - followed by N chunks constituting the *delta group* for this file. + followed by N chunks constituting the *delta group* for this file. The + *empty chunk* at the end of each *delta group* denotes the boundary to the + next filelog sub-segment.
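Putting the pieces together, a filelogs segment could be walked as in the sketch below: a filename chunk, then a delta group terminated by an empty chunk, repeated until an empty chunk appears where a filename is expected. All function names are hypothetical, and the delta chunks are returned as raw bytes without parsing their headers.

```python
import io
import struct


def _read_chunk(stream):
    """Return the next chunk's payload, or None for an empty chunk."""
    raw = stream.read(4)
    if len(raw) < 4:
        raise EOFError("stream ended inside a chunk length field")
    (length,) = struct.unpack(">i", raw)
    if length == 0:
        return None
    if length <= 4:
        raise ValueError("invalid chunk length: %d" % length)
    data = stream.read(length - 4)
    if len(data) != length - 4:
        raise EOFError("stream ended inside chunk data")
    return data


def iter_delta_group(stream):
    """Yield each chunk's data until the terminating empty chunk."""
    while True:
        data = _read_chunk(stream)
        if data is None:
            return
        yield data


def iter_filelogs(stream):
    """Yield (filename, [delta chunk data, ...]) for each sub-segment."""
    while True:
        filename = _read_chunk(stream)
        if filename is None:  # empty chunk: end of the filelogs segment
            return
        yield filename, list(iter_delta_group(stream))
```

Per the description of the treemanifests segment, the same loop would apply there, with directory names carrying a trailing "/".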

diff -r ed5b25874d99 -r 4baf79a77afa tests/test-hgweb-commands.t --- a/tests/test-hgweb-commands.t Thu Mar 23 19:54:59 2017 -0700 +++ b/tests/test-hgweb-commands.t Fri Mar 24 08:37:26 2017 -0700 @@ -58,7 +58,7 @@ Logs and changes - $ get-with-headers.py 127.0.0.1:$HGPORT 'log/?style=atom' + $ get-with-headers.py $LOCALIP:$HGPORT 'log/?style=atom' 200 Script output follows @@ -244,7 +244,7 @@ - $ get-with-headers.py 127.0.0.1:$HGPORT 'log/?style=rss' + $ get-with-headers.py $LOCALIP:$HGPORT 'log/?style=rss' 200 Script output follows @@ -422,7 +422,7 @@ (no-eol) - $ get-with-headers.py 127.0.0.1:$HGPORT 'log/1/?style=atom' + $ get-with-headers.py $LOCALIP:$HGPORT 'log/1/?style=atom' 200 Script output follows @@ -522,7 +522,7 @@ - $ get-with-headers.py 127.0.0.1:$HGPORT 'log/1/?style=rss' + $ get-with-headers.py $LOCALIP:$HGPORT 'log/1/?style=rss' 200 Script output follows @@ -618,7 +618,7 @@ (no-eol) - $ get-with-headers.py 127.0.0.1:$HGPORT 'log/1/foo/?style=atom' + $ get-with-headers.py $LOCALIP:$HGPORT 'log/1/foo/?style=atom' 200 Script output follows @@ -673,7 +673,7 @@ - $ get-with-headers.py 127.0.0.1:$HGPORT 'log/1/foo/?style=rss' + $ get-with-headers.py $LOCALIP:$HGPORT 'log/1/foo/?style=rss' 200 Script output follows @@ -694,7 +694,7 @@ - $ get-with-headers.py 127.0.0.1:$HGPORT 'shortlog/' + $ get-with-headers.py $LOCALIP:$HGPORT 'shortlog/' 200 Script output follows @@ -834,7 +834,7 @@ - $ get-with-headers.py 127.0.0.1:$HGPORT 'rev/0/' + $ get-with-headers.py $LOCALIP:$HGPORT 'rev/0/' 200 Script output follows @@ -965,7 +965,7 @@ - $ get-with-headers.py 127.0.0.1:$HGPORT 'rev/1/?style=raw' + $ get-with-headers.py $LOCALIP:$HGPORT 'rev/1/?style=raw' 200 Script output follows @@ -982,7 +982,7 @@ @@ -0,0 +1,1 @@ +2ef0ac749a14e4f57a5a822464a0902c6f7f448f 1.0 - $ get-with-headers.py 127.0.0.1:$HGPORT 'log?rev=base' + $ get-with-headers.py $LOCALIP:$HGPORT 'log?rev=base' 200 Script output follows @@ -1071,12 +1071,12 @@ - $ get-with-headers.py 127.0.0.1:$HGPORT 
'log?rev=stable&style=raw' | grep 'revision:' + $ get-with-headers.py $LOCALIP:$HGPORT 'log?rev=stable&style=raw' | grep 'revision:' revision: 2 Search with revset syntax - $ get-with-headers.py 127.0.0.1:$HGPORT 'log?rev=tip^&style=raw' + $ get-with-headers.py $LOCALIP:$HGPORT 'log?rev=tip^&style=raw' 200 Script output follows @@ -1093,7 +1093,7 @@ branch: stable - $ get-with-headers.py 127.0.0.1:$HGPORT 'log?rev=last(all(),2)^&style=raw' + $ get-with-headers.py $LOCALIP:$HGPORT 'log?rev=last(all(),2)^&style=raw' 200 Script output follows @@ -1117,7 +1117,7 @@ branch: default - $ get-with-headers.py 127.0.0.1:$HGPORT 'log?rev=last(all(,2)^&style=raw' + $ get-with-headers.py $LOCALIP:$HGPORT 'log?rev=last(all(,2)^&style=raw' 200 Script output follows @@ -1127,7 +1127,7 @@ # Mode literal keyword search - $ get-with-headers.py 127.0.0.1:$HGPORT 'log?rev=last(al(),2)^&style=raw' + $ get-with-headers.py $LOCALIP:$HGPORT 'log?rev=last(al(),2)^&style=raw' 200 Script output follows @@ -1137,7 +1137,7 @@ # Mode literal keyword search - $ get-with-headers.py 127.0.0.1:$HGPORT 'log?rev=bookmark(anotherthing)&style=raw' + $ get-with-headers.py $LOCALIP:$HGPORT 'log?rev=bookmark(anotherthing)&style=raw' 200 Script output follows @@ -1155,7 +1155,7 @@ bookmark: anotherthing - $ get-with-headers.py 127.0.0.1:$HGPORT 'log?rev=bookmark(abc)&style=raw' + $ get-with-headers.py $LOCALIP:$HGPORT 'log?rev=bookmark(abc)&style=raw' 200 Script output follows @@ -1165,7 +1165,7 @@ # Mode literal keyword search - $ get-with-headers.py 127.0.0.1:$HGPORT 'log?rev=deadbeef:&style=raw' + $ get-with-headers.py $LOCALIP:$HGPORT 'log?rev=deadbeef:&style=raw' 200 Script output follows @@ -1176,7 +1176,7 @@ - $ get-with-headers.py 127.0.0.1:$HGPORT 'log?rev=user("test")&style=raw' + $ get-with-headers.py $LOCALIP:$HGPORT 'log?rev=user("test")&style=raw' 200 Script output follows @@ -1217,7 +1217,7 @@ bookmark: anotherthing - $ get-with-headers.py 127.0.0.1:$HGPORT 'log?rev=user("re:test")&style=raw' 
+ $ get-with-headers.py $LOCALIP:$HGPORT 'log?rev=user("re:test")&style=raw' 200 Script output follows @@ -1230,11 +1230,11 @@ File-related - $ get-with-headers.py 127.0.0.1:$HGPORT 'file/1/foo/?style=raw' + $ get-with-headers.py $LOCALIP:$HGPORT 'file/1/foo/?style=raw' 200 Script output follows foo - $ get-with-headers.py 127.0.0.1:$HGPORT 'annotate/1/foo/?style=raw' + $ get-with-headers.py $LOCALIP:$HGPORT 'annotate/1/foo/?style=raw' 200 Script output follows @@ -1243,7 +1243,7 @@ - $ get-with-headers.py 127.0.0.1:$HGPORT 'file/1/?style=raw' + $ get-with-headers.py $LOCALIP:$HGPORT 'file/1/?style=raw' 200 Script output follows @@ -1259,7 +1259,7 @@ $ hg parents --template "{node|short}\n" -r 1 foo 2ef0ac749a14 - $ get-with-headers.py 127.0.0.1:$HGPORT 'file/1/foo' + $ get-with-headers.py $LOCALIP:$HGPORT 'file/1/foo' 200 Script output follows @@ -1354,7 +1354,7 @@ - $ get-with-headers.py 127.0.0.1:$HGPORT 'filediff/0/foo/?style=raw' + $ get-with-headers.py $LOCALIP:$HGPORT 'filediff/0/foo/?style=raw' 200 Script output follows @@ -1368,7 +1368,7 @@ - $ get-with-headers.py 127.0.0.1:$HGPORT 'filediff/1/foo/?style=raw' + $ get-with-headers.py $LOCALIP:$HGPORT 'filediff/1/foo/?style=raw' 200 Script output follows @@ -1384,7 +1384,7 @@ $ hg parents --template "{node|short}\n" -r 2 foo 2ef0ac749a14 - $ get-with-headers.py 127.0.0.1:$HGPORT 'file/2/foo' + $ get-with-headers.py $LOCALIP:$HGPORT 'file/2/foo' 200 Script output follows @@ -1483,23 +1483,23 @@ Overviews - $ get-with-headers.py 127.0.0.1:$HGPORT 'raw-tags' + $ get-with-headers.py $LOCALIP:$HGPORT 'raw-tags' 200 Script output follows tip cad8025a2e87f88c06259790adfa15acb4080123 1.0 2ef0ac749a14e4f57a5a822464a0902c6f7f448f - $ get-with-headers.py 127.0.0.1:$HGPORT 'raw-branches' + $ get-with-headers.py $LOCALIP:$HGPORT 'raw-branches' 200 Script output follows unstable cad8025a2e87f88c06259790adfa15acb4080123 open stable 1d22e65f027e5a0609357e7d8e7508cd2ba5d2fe inactive default 
a4f92ed23982be056b9852de5dfe873eaac7f0de inactive - $ get-with-headers.py 127.0.0.1:$HGPORT 'raw-bookmarks' + $ get-with-headers.py $LOCALIP:$HGPORT 'raw-bookmarks' 200 Script output follows something cad8025a2e87f88c06259790adfa15acb4080123 anotherthing 2ef0ac749a14e4f57a5a822464a0902c6f7f448f - $ get-with-headers.py 127.0.0.1:$HGPORT 'summary/?style=gitweb' + $ get-with-headers.py $LOCALIP:$HGPORT 'summary/?style=gitweb' 200 Script output follows @@ -1697,7 +1697,7 @@ - $ get-with-headers.py 127.0.0.1:$HGPORT 'graph/?style=gitweb' + $ get-with-headers.py $LOCALIP:$HGPORT 'graph/?style=gitweb' 200 Script output follows @@ -1843,7 +1843,7 @@ raw graph - $ get-with-headers.py 127.0.0.1:$HGPORT 'graph/?style=raw' + $ get-with-headers.py $LOCALIP:$HGPORT 'graph/?style=raw' 200 Script output follows @@ -1893,28 +1893,28 @@ capabilities - $ get-with-headers.py 127.0.0.1:$HGPORT '?cmd=capabilities'; echo + $ get-with-headers.py $LOCALIP:$HGPORT '?cmd=capabilities'; echo 200 Script output follows lookup changegroupsubset branchmap pushkey known getbundle unbundlehash batch bundle2=HG20%0Achangegroup%3D01%2C02%0Adigests%3Dmd5%2Csha1%2Csha512%0Aerror%3Dabort%2Cunsupportedcontent%2Cpushraced%2Cpushkey%0Ahgtagsfnodes%0Alistkeys%0Apushkey%0Aremote-changegroup%3Dhttp%2Chttps unbundle=HG10GZ,HG10BZ,HG10UN httpheader=1024 httpmediatype=0.1rx,0.1tx,0.2tx compression=*zlib (glob) heads - $ get-with-headers.py 127.0.0.1:$HGPORT '?cmd=heads' + $ get-with-headers.py $LOCALIP:$HGPORT '?cmd=heads' 200 Script output follows cad8025a2e87f88c06259790adfa15acb4080123 branches - $ get-with-headers.py 127.0.0.1:$HGPORT '?cmd=branches&nodes=0000000000000000000000000000000000000000' + $ get-with-headers.py $LOCALIP:$HGPORT '?cmd=branches&nodes=0000000000000000000000000000000000000000' 200 Script output follows 0000000000000000000000000000000000000000 0000000000000000000000000000000000000000 0000000000000000000000000000000000000000 0000000000000000000000000000000000000000 changegroup - $ 
get-with-headers.py 127.0.0.1:$HGPORT '?cmd=changegroup&roots=0000000000000000000000000000000000000000' + $ get-with-headers.py $LOCALIP:$HGPORT '?cmd=changegroup&roots=0000000000000000000000000000000000000000' 200 Script output follows x\x9c\xbd\x94MHTQ\x14\xc7'+\x9d\xc66\x81\x89P\xc1\xa3\x14\xcct\xba\xef\xbe\xfb\xde\xbb\xcfr0\xb3"\x02\x11[%\x98\xdcO\xa7\xd2\x19\x98y\xd2\x07h"\x96\xa0e\xda\xa6lUY-\xca\x08\xa2\x82\x16\x96\xd1\xa2\xf0#\xc8\x95\x1b\xdd$!m*"\xc8\x82\xea\xbe\x9c\x01\x85\xc9\x996\x1d\xf8\xc1\xe3~\x9d\xff9\xef\x7f\xaf\xcf\xe7\xbb\x19\xfc4\xec^\xcb\x9b\xfbz\xa6\xbe\xb3\x90_\xef/\x8d\x9e\xad\xbe\xe4\xcb0\xd2\xec\xad\x12X:\xc8\x12\x12\xd9:\x95\xba \x1cG\xb7$\xc5\xc44\x1c(\x1d\x03\x03\xdb\x84\x0cK#\xe0\x8a\xb8\x1b\x00\x1a\x08p\xb2SF\xa3\x01\x8f\x00%q\xa1Ny{k!8\xe5t>[{\xe2j\xddl\xc3\xcf\xee\xd0\xddW\x9ff3U\x9djobj\xbb\x87E\x88\x05l\x001\x12\x18\x13\xc6 \xb7(\xe3\x02a\x80\x81\xcel.u\x9b\x1b\x8c\x91\x80Z\x0c\x15\x15 (esc) @@ -1925,14 +1925,14 @@ stream_out - $ get-with-headers.py 127.0.0.1:$HGPORT '?cmd=stream_out' + $ get-with-headers.py $LOCALIP:$HGPORT '?cmd=stream_out' 200 Script output follows 1 failing unbundle, requires POST request - $ get-with-headers.py 127.0.0.1:$HGPORT '?cmd=unbundle' + $ get-with-headers.py $LOCALIP:$HGPORT '?cmd=unbundle' 405 push requires POST request 0 @@ -1941,7 +1941,7 @@ Static files - $ get-with-headers.py 127.0.0.1:$HGPORT 'static/style.css' + $ get-with-headers.py $LOCALIP:$HGPORT 'static/style.css' 200 Script output follows a { text-decoration:none; } @@ -2077,7 +2077,7 @@ > --cwd .. 
-R `pwd` $ cat hg.pid >> $DAEMON_PIDS - $ get-with-headers.py 127.0.0.1:$HGPORT 'log?rev=adds("foo")&style=raw' + $ get-with-headers.py $LOCALIP:$HGPORT 'log?rev=adds("foo")&style=raw' 200 Script output follows @@ -2110,7 +2110,7 @@ Graph json escape of multibyte character - $ get-with-headers.py 127.0.0.1:$HGPORT 'graph/' > out + $ get-with-headers.py $LOCALIP:$HGPORT 'graph/' > out >>> from __future__ import print_function >>> for line in open("out"): ... if line.startswith("var data ="): @@ -2121,14 +2121,14 @@ (plain version to check the format) - $ get-with-headers.py 127.0.0.1:$HGPORT '?cmd=capabilities' | dd ibs=75 count=1 2> /dev/null; echo + $ get-with-headers.py $LOCALIP:$HGPORT '?cmd=capabilities' | dd ibs=75 count=1 2> /dev/null; echo 200 Script output follows lookup changegroupsubset branchmap pushkey known (spread version to check the content) - $ get-with-headers.py 127.0.0.1:$HGPORT '?cmd=capabilities' | tr ' ' '\n'; echo + $ get-with-headers.py $LOCALIP:$HGPORT '?cmd=capabilities' | tr ' ' '\n'; echo 200 Script output @@ -2194,23 +2194,23 @@ Test paging - $ get-with-headers.py 127.0.0.1:$HGPORT \ + $ get-with-headers.py $LOCALIP:$HGPORT \ > 'graph/?style=raw' | grep changeset changeset: aed2d9c1d0e7 changeset: b60a39a85a01 - $ get-with-headers.py 127.0.0.1:$HGPORT \ + $ get-with-headers.py $LOCALIP:$HGPORT \ > 'graph/?style=raw&revcount=3' | grep changeset changeset: aed2d9c1d0e7 changeset: b60a39a85a01 changeset: ada793dcc118 - $ get-with-headers.py 127.0.0.1:$HGPORT \ + $ get-with-headers.py $LOCALIP:$HGPORT \ > 'graph/e06180cbfb0?style=raw&revcount=3' | grep changeset changeset: e06180cbfb0c changeset: b4e73ffab476 - $ get-with-headers.py 127.0.0.1:$HGPORT \ + $ get-with-headers.py $LOCALIP:$HGPORT \ > 'graph/b4e73ffab47?style=raw&revcount=3' | grep changeset changeset: b4e73ffab476 diff -r ed5b25874d99 -r 4baf79a77afa tests/test-hgweb-descend-empties.t --- a/tests/test-hgweb-descend-empties.t Thu Mar 23 19:54:59 2017 -0700 +++ 
b/tests/test-hgweb-descend-empties.t Fri Mar 24 08:37:26 2017 -0700 @@ -29,7 +29,7 @@ manifest with descending (paper) - $ get-with-headers.py 127.0.0.1:$HGPORT 'file' + $ get-with-headers.py $LOCALIP:$HGPORT 'file' 200 Script output follows @@ -147,7 +147,7 @@ manifest with descending (coal) - $ get-with-headers.py 127.0.0.1:$HGPORT 'file?style=coal' + $ get-with-headers.py $LOCALIP:$HGPORT 'file?style=coal' 200 Script output follows @@ -266,7 +266,7 @@ manifest with descending (monoblue) - $ get-with-headers.py 127.0.0.1:$HGPORT 'file?style=monoblue' + $ get-with-headers.py $LOCALIP:$HGPORT 'file?style=monoblue' 200 Script output follows @@ -379,7 +379,7 @@ manifest with descending (gitweb) - $ get-with-headers.py 127.0.0.1:$HGPORT 'file?style=gitweb' + $ get-with-headers.py $LOCALIP:$HGPORT 'file?style=gitweb' 200 Script output follows @@ -482,7 +482,7 @@ manifest with descending (spartan) - $ get-with-headers.py 127.0.0.1:$HGPORT 'file?style=spartan' + $ get-with-headers.py $LOCALIP:$HGPORT 'file?style=spartan' 200 Script output follows diff -r ed5b25874d99 -r 4baf79a77afa tests/test-hgweb-json.t --- a/tests/test-hgweb-json.t Thu Mar 23 19:54:59 2017 -0700 +++ b/tests/test-hgweb-json.t Fri Mar 24 08:37:26 2017 -0700 @@ -1549,6 +1549,10 @@ ], "topics": [ { + "summary": "Colorizing Outputs", + "topic": "color" + }, + { "summary": "Configuration Files", "topic": "config" }, @@ -1593,6 +1597,10 @@ "topic": "merge-tools" }, { + "summary": "Pager Support", + "topic": "pager" + }, + { "summary": "File Name Patterns", "topic": "patterns" }, diff -r ed5b25874d99 -r 4baf79a77afa tests/test-hgweb-no-path-info.t --- a/tests/test-hgweb-no-path-info.t Thu Mar 23 19:54:59 2017 -0700 +++ b/tests/test-hgweb-no-path-info.t Fri Mar 24 08:37:26 2017 -0700 @@ -49,7 +49,7 @@ > 'REQUEST_METHOD': 'GET', > 'PATH_INFO': '/', > 'SCRIPT_NAME': '', - > 'SERVER_NAME': '127.0.0.1', + > 'SERVER_NAME': '$LOCALIP', > 'SERVER_PORT': os.environ['HGPORT'], > 'SERVER_PROTOCOL': 'HTTP/1.0' > } @@ 
-79,16 +79,16 @@ - http://127.0.0.1:$HGPORT/ (glob) - (glob) - (glob) + http://$LOCALIP:$HGPORT/ (glob) + (glob) + (glob) repo Changelog 1970-01-01T00:00:00+00:00 [default] test - http://127.0.0.1:$HGPORT/#changeset-61c9426e69fef294feed5e2bbfc97d39944a5b1c (glob) - (glob) + http://$LOCALIP:$HGPORT/#changeset-61c9426e69fef294feed5e2bbfc97d39944a5b1c (glob) + (glob) test test diff -r ed5b25874d99 -r 4baf79a77afa tests/test-hgweb-no-request-uri.t --- a/tests/test-hgweb-no-request-uri.t Thu Mar 23 19:54:59 2017 -0700 +++ b/tests/test-hgweb-no-request-uri.t Fri Mar 24 08:37:26 2017 -0700 @@ -48,7 +48,7 @@ > 'wsgi.run_once': False, > 'REQUEST_METHOD': 'GET', > 'SCRIPT_NAME': '', - > 'SERVER_NAME': '127.0.0.1', + > 'SERVER_NAME': '$LOCALIP', > 'SERVER_PORT': os.environ['HGPORT'], > 'SERVER_PROTOCOL': 'HTTP/1.0' > } @@ -90,16 +90,16 @@ - http://127.0.0.1:$HGPORT/ (glob) - (glob) - (glob) + http://$LOCALIP:$HGPORT/ (glob) + (glob) + (glob) repo Changelog 1970-01-01T00:00:00+00:00 [default] test - http://127.0.0.1:$HGPORT/#changeset-61c9426e69fef294feed5e2bbfc97d39944a5b1c (glob) - (glob) + http://$LOCALIP:$HGPORT/#changeset-61c9426e69fef294feed5e2bbfc97d39944a5b1c (glob) + (glob) test test diff -r ed5b25874d99 -r 4baf79a77afa tests/test-hgweb-non-interactive.t --- a/tests/test-hgweb-non-interactive.t Thu Mar 23 19:54:59 2017 -0700 +++ b/tests/test-hgweb-non-interactive.t Fri Mar 24 08:37:26 2017 -0700 @@ -60,7 +60,7 @@ > 'SCRIPT_NAME': '', > 'PATH_INFO': '', > 'QUERY_STRING': '', - > 'SERVER_NAME': '127.0.0.1', + > 'SERVER_NAME': '$LOCALIP', > 'SERVER_PORT': os.environ['HGPORT'], > 'SERVER_PROTOCOL': 'HTTP/1.0' > } diff -r ed5b25874d99 -r 4baf79a77afa tests/test-hgweb-raw.t --- a/tests/test-hgweb-raw.t Thu Mar 23 19:54:59 2017 -0700 +++ b/tests/test-hgweb-raw.t Fri Mar 24 08:37:26 2017 -0700 @@ -32,7 +32,7 @@ It is very boring to read, but computers don't care about things like that. 
   $ cat access.log error.log
-  127.0.0.1 - - [*] "GET /?f=bf0ff59095c9;file=sub/some%20text%25.txt;style=raw HTTP/1.1" 200 - (glob)
+  $LOCALIP - - [*] "GET /?f=bf0ff59095c9;file=sub/some%20text%25.txt;style=raw HTTP/1.1" 200 - (glob)
   $ rm access.log error.log
   $ hg serve -p $HGPORT -A access.log -E error.log -d --pid-file=hg.pid \
@@ -53,6 +53,6 @@
   It is very boring to read, but computers don't care about things like that.
   $ cat access.log error.log
-  127.0.0.1 - - [*] "GET /?f=bf0ff59095c9;file=sub/some%20text%25.txt;style=raw HTTP/1.1" 200 - (glob)
+  $LOCALIP - - [*] "GET /?f=bf0ff59095c9;file=sub/some%20text%25.txt;style=raw HTTP/1.1" 200 - (glob)
   $ cd ..
diff -r ed5b25874d99 -r 4baf79a77afa tests/test-hgweb-symrev.t
--- a/tests/test-hgweb-symrev.t Thu Mar 23 19:54:59 2017 -0700
+++ b/tests/test-hgweb-symrev.t Fri Mar 24 08:37:26 2017 -0700
@@ -37,7 +37,7 @@
 (De)referencing symbolic revisions (paper)
-  $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'shortlog?style=paper' | egrep $REVLINKS
+  $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'shortlog?style=paper' | egrep $REVLINKS
  • graph
  • changeset
  • browse
  • @@ -52,7 +52,7 @@ more | rev 2: (0) tip - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'graph?style=paper' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'graph?style=paper' | egrep $REVLINKS
  • log
  • changeset
  • browse
  • @@ -63,7 +63,7 @@ more | rev 2: (0) tip - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'file?style=paper' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'file?style=paper' | egrep $REVLINKS
  • log
  • graph
  • changeset
  • @@ -74,24 +74,24 @@ - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'branches?style=paper' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'branches?style=paper' | egrep $REVLINKS - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'tags?style=paper' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'tags?style=paper' | egrep $REVLINKS - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'bookmarks?style=paper' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'bookmarks?style=paper' | egrep $REVLINKS - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'shortlog?style=paper&rev=all()' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'shortlog?style=paper&rev=all()' | egrep $REVLINKS third second first - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'rev/xyzzy?style=paper' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'rev/xyzzy?style=paper' | egrep $REVLINKS
  • log
  • graph
  • raw
  • @@ -102,7 +102,7 @@ 9d8c40cba617 foo - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'shortlog/xyzzy?style=paper' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'shortlog/xyzzy?style=paper' | egrep $REVLINKS
  • graph
  • changeset
  • browse
  • @@ -116,7 +116,7 @@ more | rev 1: (0) tip - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'graph/xyzzy?style=paper' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'graph/xyzzy?style=paper' | egrep $REVLINKS
  • log
  • changeset
  • browse
  • @@ -127,7 +127,7 @@ more | rev 1: (0) tip - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'file/xyzzy?style=paper' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'file/xyzzy?style=paper' | egrep $REVLINKS
  • log
  • graph
  • changeset
  • @@ -138,7 +138,7 @@ - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'file/xyzzy/foo?style=paper' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'file/xyzzy/foo?style=paper' | egrep $REVLINKS
  • log
  • graph
  • changeset
  • @@ -153,7 +153,7 @@ 43c799df6e75 9d8c40cba617 - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'log/xyzzy/foo?style=paper' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'log/xyzzy/foo?style=paper' | egrep $REVLINKS href="/atom-log/tip/foo" title="Atom feed for test:foo" /> href="/rss-log/tip/foo" title="RSS feed for test:foo" />
  • log
  • @@ -176,7 +176,7 @@ more | (0) tip - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'annotate/xyzzy/foo?style=paper' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'annotate/xyzzy/foo?style=paper' | egrep $REVLINKS
  • log
  • graph
  • changeset
  • @@ -200,7 +200,7 @@ diff changeset - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'diff/xyzzy/foo?style=paper' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'diff/xyzzy/foo?style=paper' | egrep $REVLINKS
  • log
  • graph
  • changeset
  • @@ -215,7 +215,7 @@ 43c799df6e75 9d8c40cba617 - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'comparison/xyzzy/foo?style=paper' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'comparison/xyzzy/foo?style=paper' | egrep $REVLINKS
  • log
  • graph
  • changeset
  • @@ -232,7 +232,7 @@ (De)referencing symbolic revisions (coal) - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'shortlog?style=coal' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'shortlog?style=coal' | egrep $REVLINKS
  • graph
  • changeset
  • browse
  • @@ -247,7 +247,7 @@ more | rev 2: (0) tip - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'graph?style=coal' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'graph?style=coal' | egrep $REVLINKS
  • log
  • changeset
  • browse
  • @@ -258,7 +258,7 @@ more | rev 2: (0) tip - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'file?style=coal' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'file?style=coal' | egrep $REVLINKS
  • log
  • graph
  • changeset
  • @@ -269,24 +269,24 @@ - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'branches?style=coal' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'branches?style=coal' | egrep $REVLINKS - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'tags?style=coal' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'tags?style=coal' | egrep $REVLINKS - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'bookmarks?style=coal' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'bookmarks?style=coal' | egrep $REVLINKS - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'shortlog?style=coal&rev=all()' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'shortlog?style=coal&rev=all()' | egrep $REVLINKS third second first - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'rev/xyzzy?style=coal' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'rev/xyzzy?style=coal' | egrep $REVLINKS
  • log
  • graph
  • raw
  • @@ -297,7 +297,7 @@ 9d8c40cba617 foo - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'shortlog/xyzzy?style=coal' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'shortlog/xyzzy?style=coal' | egrep $REVLINKS
  • graph
  • changeset
  • browse
  • @@ -311,7 +311,7 @@ more | rev 1: (0) tip - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'graph/xyzzy?style=coal' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'graph/xyzzy?style=coal' | egrep $REVLINKS
  • log
  • changeset
  • browse
  • @@ -322,7 +322,7 @@ more | rev 1: (0) tip - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'file/xyzzy?style=coal' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'file/xyzzy?style=coal' | egrep $REVLINKS
  • log
  • graph
  • changeset
  • @@ -333,7 +333,7 @@ - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'file/xyzzy/foo?style=coal' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'file/xyzzy/foo?style=coal' | egrep $REVLINKS
  • log
  • graph
  • changeset
  • @@ -348,7 +348,7 @@ 43c799df6e75 9d8c40cba617 - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'log/xyzzy/foo?style=coal' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'log/xyzzy/foo?style=coal' | egrep $REVLINKS href="/atom-log/tip/foo" title="Atom feed for test:foo" /> href="/rss-log/tip/foo" title="RSS feed for test:foo" />
  • log
  • @@ -371,7 +371,7 @@ more | (0) tip - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'annotate/xyzzy/foo?style=coal' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'annotate/xyzzy/foo?style=coal' | egrep $REVLINKS
  • log
  • graph
  • changeset
  • @@ -395,7 +395,7 @@ diff changeset - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'diff/xyzzy/foo?style=coal' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'diff/xyzzy/foo?style=coal' | egrep $REVLINKS
  • log
  • graph
  • changeset
  • @@ -410,7 +410,7 @@ 43c799df6e75 9d8c40cba617 - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'comparison/xyzzy/foo?style=coal' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'comparison/xyzzy/foo?style=coal' | egrep $REVLINKS
  • log
  • graph
  • changeset
  • @@ -427,7 +427,7 @@ (De)referencing symbolic revisions (gitweb) - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'summary?style=gitweb' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'summary?style=gitweb' | egrep $REVLINKS files | zip | changeset | @@ -447,7 +447,7 @@ changelog | files - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'shortlog?style=gitweb' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'shortlog?style=gitweb' | egrep $REVLINKS changelog | graph | files | zip | @@ -463,7 +463,7 @@ files (0) tip - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'log?style=gitweb' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'log?style=gitweb' | egrep $REVLINKS shortlog | graph | files | zip | @@ -476,7 +476,7 @@ changeset
    (0) tip
    - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'graph?style=gitweb' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'graph?style=gitweb' | egrep $REVLINKS shortlog | changelog | files | @@ -487,25 +487,25 @@ more | (0) tip - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'tags?style=gitweb' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'tags?style=gitweb' | egrep $REVLINKS tip changeset | changelog | files - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'bookmarks?style=gitweb' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'bookmarks?style=gitweb' | egrep $REVLINKS xyzzy changeset | changelog | files - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'branches?style=gitweb' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'branches?style=gitweb' | egrep $REVLINKS default changeset | changelog | files - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'file?style=gitweb' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'file?style=gitweb' | egrep $REVLINKS changeset | zip | [up] dir @@ -516,7 +516,7 @@ revisions | annotate - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'shortlog?style=gitweb&rev=all()' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'shortlog?style=gitweb&rev=all()' | egrep $REVLINKS files | zip Thu, 01 Jan 1970 00:00:00 +0000third default tip changeset
    @@ -525,7 +525,7 @@ Thu, 01 Jan 1970 00:00:00 +0000first changeset
    - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'rev/xyzzy?style=gitweb' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'rev/xyzzy?style=gitweb' | egrep $REVLINKS shortlog | changelog | graph | @@ -542,7 +542,7 @@ comparison | revisions - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'shortlog/xyzzy?style=gitweb' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'shortlog/xyzzy?style=gitweb' | egrep $REVLINKS changelog | graph | files | zip | @@ -555,7 +555,7 @@ files (0) tip - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'log/xyzzy?style=gitweb' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'log/xyzzy?style=gitweb' | egrep $REVLINKS shortlog | graph | files | zip | @@ -566,7 +566,7 @@ changeset
    (0) tip
    - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'graph/xyzzy?style=gitweb' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'graph/xyzzy?style=gitweb' | egrep $REVLINKS shortlog | changelog | files | @@ -577,7 +577,7 @@ more | (0) tip - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'file/xyzzy?style=gitweb' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'file/xyzzy?style=gitweb' | egrep $REVLINKS changeset | zip | [up] dir @@ -588,7 +588,7 @@ revisions | annotate - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'file/xyzzy/foo?style=gitweb' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'file/xyzzy/foo?style=gitweb' | egrep $REVLINKS files | changeset | latest | @@ -601,7 +601,7 @@ 9d8c40cba617 - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'log/xyzzy/foo?style=gitweb' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'log/xyzzy/foo?style=gitweb' | egrep $REVLINKS file | annotate | diff | @@ -616,9 +616,11 @@ file | diff | annotate + less + more (0) tip - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'annotate/xyzzy/foo?style=gitweb' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'annotate/xyzzy/foo?style=gitweb' | egrep $REVLINKS files | changeset | file | @@ -640,7 +642,7 @@ diff changeset - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'diff/xyzzy/foo?style=gitweb' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'diff/xyzzy/foo?style=gitweb' | egrep $REVLINKS files | changeset | file | @@ -653,7 +655,7 @@ 9d8c40cba617 - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'comparison/xyzzy/foo?style=gitweb' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'comparison/xyzzy/foo?style=gitweb' | egrep $REVLINKS files | changeset | file | @@ -668,7 +670,7 @@ (De)referencing symbolic revisions (monoblue) - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'summary?style=monoblue' | 
egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'summary?style=monoblue' | egrep $REVLINKS
  • zip
  • changeset | @@ -688,7 +690,7 @@ changelog | files - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'shortlog?style=monoblue' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'shortlog?style=monoblue' | egrep $REVLINKS
  • graph
  • files
  • zip
  • @@ -703,7 +705,7 @@ files (0) tip - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'log?style=monoblue' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'log?style=monoblue' | egrep $REVLINKS
  • graph
  • files
  • zip
  • @@ -712,31 +714,31 @@

    first

    (0) tip - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'graph?style=monoblue' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'graph?style=monoblue' | egrep $REVLINKS
  • files
  • less more | (0) tip - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'tags?style=monoblue' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'tags?style=monoblue' | egrep $REVLINKS tip changeset | changelog | files - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'bookmarks?style=monoblue' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'bookmarks?style=monoblue' | egrep $REVLINKS xyzzy changeset | changelog | files - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'branches?style=monoblue' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'branches?style=monoblue' | egrep $REVLINKS default changeset | changelog | files - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'file?style=monoblue' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'file?style=monoblue' | egrep $REVLINKS
  • graph
  • changeset
  • zip
  • @@ -749,13 +751,13 @@ revisions | annotate - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'shortlog?style=monoblue&rev=all()' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'shortlog?style=monoblue&rev=all()' | egrep $REVLINKS
  • zip
  • third default tip

    second xyzzy

    first

    - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'rev/xyzzy?style=monoblue' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'rev/xyzzy?style=monoblue' | egrep $REVLINKS
  • graph
  • files
  • raw
  • @@ -771,7 +773,7 @@ comparison | revisions - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'shortlog/xyzzy?style=monoblue' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'shortlog/xyzzy?style=monoblue' | egrep $REVLINKS
  • graph
  • files
  • zip
  • @@ -783,7 +785,7 @@ files (0) tip - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'log/xyzzy?style=monoblue' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'log/xyzzy?style=monoblue' | egrep $REVLINKS
  • graph
  • files
  • zip
  • @@ -791,13 +793,13 @@

    first

    (0) tip - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'graph/xyzzy?style=monoblue' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'graph/xyzzy?style=monoblue' | egrep $REVLINKS
  • files
  • less more | (0) tip - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'file/xyzzy?style=monoblue' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'file/xyzzy?style=monoblue' | egrep $REVLINKS
  • graph
  • changeset
  • zip
  • @@ -810,7 +812,7 @@ revisions | annotate - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'file/xyzzy/foo?style=monoblue' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'file/xyzzy/foo?style=monoblue' | egrep $REVLINKS
  • graph
  • files
  • latest
  • @@ -823,7 +825,7 @@ 9d8c40cba617 - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'log/xyzzy/foo?style=monoblue' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'log/xyzzy/foo?style=monoblue' | egrep $REVLINKS
  • graph
  • files
  • file
  • @@ -841,7 +843,7 @@ annotate (0) tip - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'annotate/xyzzy/foo?style=monoblue' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'annotate/xyzzy/foo?style=monoblue' | egrep $REVLINKS
  • graph
  • files
  • file
  • @@ -863,7 +865,7 @@ diff changeset - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'diff/xyzzy/foo?style=monoblue' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'diff/xyzzy/foo?style=monoblue' | egrep $REVLINKS
  • graph
  • files
  • file
  • @@ -876,7 +878,7 @@
    43c799df6e75
    9d8c40cba617
    - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'comparison/xyzzy/foo?style=monoblue' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'comparison/xyzzy/foo?style=monoblue' | egrep $REVLINKS
  • graph
  • files
  • file
  • @@ -891,7 +893,7 @@ (De)referencing symbolic revisions (spartan) - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'shortlog?style=spartan' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'shortlog?style=spartan' | egrep $REVLINKS changelog graph files @@ -902,7 +904,7 @@ first navigate: (0) tip - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'log?style=spartan' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'log?style=spartan' | egrep $REVLINKS shortlog graph files @@ -919,20 +921,20 @@ dir/bar foo navigate: (0) tip - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'graph?style=spartan' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'graph?style=spartan' | egrep $REVLINKS changelog shortlog files navigate: (0) tip navigate: (0) tip - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'tags?style=spartan' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'tags?style=spartan' | egrep $REVLINKS tip - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'branches?style=spartan' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'branches?style=spartan' | egrep $REVLINKS default - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'file?style=spartan' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'file?style=spartan' | egrep $REVLINKS changelog shortlog graph @@ -944,7 +946,7 @@ foo - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'shortlog?style=spartan&rev=all()' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'shortlog?style=spartan&rev=all()' | egrep $REVLINKS zip 9d8c40cba617 a7c1559b7bba @@ -960,7 +962,7 @@ files: dir/bar foo - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'rev/xyzzy?style=spartan' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'rev/xyzzy?style=spartan' | egrep $REVLINKS changelog shortlog graph @@ -972,7 +974,7 @@ 9d8c40cba617 foo - $ 
"$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'shortlog/xyzzy?style=spartan' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'shortlog/xyzzy?style=spartan' | egrep $REVLINKS changelog graph files @@ -982,7 +984,7 @@ first navigate: (0) tip - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'log/xyzzy?style=spartan' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'log/xyzzy?style=spartan' | egrep $REVLINKS shortlog graph files @@ -996,14 +998,14 @@ dir/bar foo navigate: (0) tip - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'graph/xyzzy?style=spartan' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'graph/xyzzy?style=spartan' | egrep $REVLINKS changelog shortlog files navigate: (0) tip navigate: (0) tip - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'file/xyzzy?style=spartan' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'file/xyzzy?style=spartan' | egrep $REVLINKS changelog shortlog graph @@ -1015,7 +1017,7 @@ foo - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'file/xyzzy/foo?style=spartan' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'file/xyzzy/foo?style=spartan' | egrep $REVLINKS changelog shortlog graph @@ -1028,7 +1030,7 @@ 9d8c40cba617 - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'log/xyzzy/foo?style=spartan' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'log/xyzzy/foo?style=spartan' | egrep $REVLINKS href="/atom-log/tip/foo" title="Atom feed for test:foo"> href="/rss-log/tip/foo" title="RSS feed for test:foo"> file @@ -1045,7 +1047,7 @@ (diff) (annotate) - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 'annotate/xyzzy/foo?style=spartan' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'annotate/xyzzy/foo?style=spartan' | egrep $REVLINKS changelog shortlog graph @@ -1067,7 +1069,7 @@ diff changeset - $ "$TESTDIR/get-with-headers.py" 127.0.0.1:$HGPORT 
'diff/xyzzy/foo?style=spartan' | egrep $REVLINKS + $ "$TESTDIR/get-with-headers.py" $LOCALIP:$HGPORT 'diff/xyzzy/foo?style=spartan' | egrep $REVLINKS changelog shortlog graph diff -r ed5b25874d99 -r 4baf79a77afa tests/test-hgwebdir.t --- a/tests/test-hgwebdir.t Thu Mar 23 19:54:59 2017 -0700 +++ b/tests/test-hgwebdir.t Fri Mar 24 08:37:26 2017 -0700 @@ -1421,7 +1421,7 @@ > EOF $ hg serve -d --pid-file=hg.pid --web-conf paths.conf \ > -A access-paths.log -E error-paths-9.log - listening at http://*:$HGPORT1/ (bound to 127.0.0.1:$HGPORT1) (glob) + listening at http://*:$HGPORT1/ (bound to *$LOCALIP*:$HGPORT1) (glob) $ cat hg.pid >> $DAEMON_PIDS $ get-with-headers.py localhost:$HGPORT1 '?style=raw' 200 Script output follows @@ -1433,7 +1433,7 @@ $ killdaemons.py $ hg serve -p $HGPORT2 -d -v --pid-file=hg.pid --web-conf paths.conf \ > -A access-paths.log -E error-paths-10.log - listening at http://*:$HGPORT2/ (bound to 127.0.0.1:$HGPORT2) (glob) + listening at http://*:$HGPORT2/ (bound to *$LOCALIP*:$HGPORT2) (glob) $ cat hg.pid >> $DAEMON_PIDS $ get-with-headers.py localhost:$HGPORT2 '?style=raw' 200 Script output follows @@ -1566,6 +1566,119 @@ /b/ /c/ + $ killdaemons.py + $ cat > paths.conf << EOF + > [paths] + > /dir1/a_repo = $root/a + > /dir1/a_repo/b_repo = $root/b + > /dir1/dir2/index = $root/b + > EOF + $ hg serve -p $HGPORT1 -d --pid-file hg.pid --webdir-conf paths.conf + $ cat hg.pid >> $DAEMON_PIDS + + $ echo 'index file' > $root/a/index + $ hg --cwd $root/a ci -Am 'add index file' + adding index + + $ get-with-headers.py localhost:$HGPORT1 '' | grep 'a_repo' + dir1/a_repo + + dir1/a_repo/b_repo + + + $ get-with-headers.py localhost:$HGPORT1 'index' | grep 'a_repo' + dir1/a_repo + + dir1/a_repo/b_repo + + + $ get-with-headers.py localhost:$HGPORT1 'dir1' | grep 'a_repo' + a_repo + + a_repo/b_repo + + + $ get-with-headers.py localhost:$HGPORT1 'dir1/index' | grep 'a_repo' + a_repo + + a_repo/b_repo + + + $ get-with-headers.py localhost:$HGPORT1 'dir1/a_repo' 
| grep 'a_repo' + + + + dir1/a_repo: log + href="/dir1/a_repo/atom-log" title="Atom feed for dir1/a_repo" /> + href="/dir1/a_repo/rss-log" title="RSS feed for dir1/a_repo" /> + mercurial +
  • graph
  • +
  • tags
  • +
  • bookmarks
  • +
  • branches
  • +
  • changeset
  • +
  • browse
  • +
  • help
  • + + + +