hgeditor
author Gregory Szorc <gregory.szorc@gmail.com>
Tue, 28 Aug 2018 15:02:48 -0700
changeset 39438 aeb551a3bb8a
parent 26781 1aee2ab0f902
permissions -rwxr-xr-x
cborutil: implement sans I/O decoder The vendored CBOR package decodes by calling read(n) on an object. There are a number of disadvantages to this: * Uses blocking I/O. If sufficient data is not available, the decoder will hang until it is. * No support for partial reads. If the read(n) returns less data than requested, the decoder raises an error. * Requires the use of a file like object. If the original data is in say a buffer, we need to "cast" it to e.g. a BytesIO to appease the decoder. In addition, the vendored CBOR decoder doesn't provide flexibility that we desire. Specifically: * It buffers indefinite length bytestrings instead of streaming them. * It doesn't allow limiting the set of types that can be decoded. This property is useful when implementing a "hardened" decoder that is less susceptible to abusive input. * It doesn't provide sufficient "hook points" and introspection to institute checks around behavior. These are useful for implementing a "hardened" decoder. This all adds up to a reasonable set of justifications for writing our own decoder. So, this commit implements our own CBOR decoder. At the heart of the decoder is a function that decodes a single "item" from a buffer. This item can be a complete simple value or a special value, such as "start of array." Using this function, we can build a decoder that effectively iterates over the stream of decoded items and builds up higher-level values, such as arrays, maps, sets, and indefinite length bytestrings. And we can do this without performing I/O in the decoder itself. The core of the sans I/O decoder will probably not be used directly. Instead, it is expected that we'll build utility functions for invoking the decoder given specific input types. This will allow extreme flexibility in how data is delivered to the decoder. I'm pretty happy with the state of the decoder modulo the TODO items to track wanted features to help with a "hardened" decoder. The one thing I could be convinced to change is the handling of semantic tags. Since we only support a single semantic tag (sets), I thought it would be easier to handle them inline in decodeitem(). This is simpler now. But if we add support for other semantic tags, it will likely be easier to move semantic tag handling outside of decodeitem(). But, properly supporting semantic tags opens up a whole can of worms, as many semantic tags imply new types. I'm optimistic we won't need these in Mercurial. But who knows. I'm also pretty happy with the test coverage. Writing comprehensive tests for partial decoding did flush out a handful of bugs. One general improvement to testing would be fuzz testing for partial decoding. I may implement that later. I also anticipate switching the wire protocol code to this new decoder will flush out any lingering bugs. Differential Revision: https://phab.mercurial-scm.org/D4414
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
544
3d4d5f2aba9a Remove bashisms and use /bin/sh instead of /bin/bash.
Thomas Arendsen Hein <thomas@intevation.de>
parents: 484
diff changeset
     1
#!/bin/sh
186
9a2075c0b9b8 Add $HGEDITOR hook and example script
mpm@selenic.com
parents:
diff changeset
     2
#
1599
f93fde8f5027 remove the gpg stuff from hgeditor (superseded by the signing extension)
Benoit Boissinot <benoit.boissinot@ens-lyon.org>
parents: 1009
diff changeset
     3
# This is an example of using HGEDITOR to create of diff to review the
26781
1aee2ab0f902 spelling: trivial spell checking
Mads Kiilerich <madski@unity3d.com>
parents: 11266
diff changeset
     4
# changes while committing.
684
4ccf3de52989 Turn off signing with hgeditor by default
Matt Mackall <mpm@selenic.com>
parents: 683
diff changeset
     5
666
0100a43788ca hgeditor: Remove EMAIL default for HGUSER, comment editor selection
Radoslaw "AstralStorm" Szkodzinski <astralstorm@gorzow.mm.pl>
parents: 665
diff changeset
     6
# If you want to pass your favourite editor some other parameters
0100a43788ca hgeditor: Remove EMAIL default for HGUSER, comment editor selection
Radoslaw "AstralStorm" Szkodzinski <astralstorm@gorzow.mm.pl>
parents: 665
diff changeset
     7
# only for Mercurial, modify this:
796
33a272b79e54 Replaced mktemp and usage of ${par:=word}.
Thomas Arendsen Hein <thomas@intevation.de>
parents: 769
diff changeset
     8
case "${EDITOR}" in
33a272b79e54 Replaced mktemp and usage of ${par:=word}.
Thomas Arendsen Hein <thomas@intevation.de>
parents: 769
diff changeset
     9
    "")
33a272b79e54 Replaced mktemp and usage of ${par:=word}.
Thomas Arendsen Hein <thomas@intevation.de>
parents: 769
diff changeset
    10
        EDITOR="vi"
33a272b79e54 Replaced mktemp and usage of ${par:=word}.
Thomas Arendsen Hein <thomas@intevation.de>
parents: 769
diff changeset
    11
        ;;
348
442eb02cf870 Improved hgeditor:
Thomas Arendsen Hein <thomas@intevation.de>
parents: 280
diff changeset
    12
    emacs)
442eb02cf870 Improved hgeditor:
Thomas Arendsen Hein <thomas@intevation.de>
parents: 280
diff changeset
    13
        EDITOR="$EDITOR -nw"
442eb02cf870 Improved hgeditor:
Thomas Arendsen Hein <thomas@intevation.de>
parents: 280
diff changeset
    14
        ;;
442eb02cf870 Improved hgeditor:
Thomas Arendsen Hein <thomas@intevation.de>
parents: 280
diff changeset
    15
    gvim|vim)
442eb02cf870 Improved hgeditor:
Thomas Arendsen Hein <thomas@intevation.de>
parents: 280
diff changeset
    16
        EDITOR="$EDITOR -f -o"
442eb02cf870 Improved hgeditor:
Thomas Arendsen Hein <thomas@intevation.de>
parents: 280
diff changeset
    17
        ;;
442eb02cf870 Improved hgeditor:
Thomas Arendsen Hein <thomas@intevation.de>
parents: 280
diff changeset
    18
esac
442eb02cf870 Improved hgeditor:
Thomas Arendsen Hein <thomas@intevation.de>
parents: 280
diff changeset
    19
796
33a272b79e54 Replaced mktemp and usage of ${par:=word}.
Thomas Arendsen Hein <thomas@intevation.de>
parents: 769
diff changeset
    20
33a272b79e54 Replaced mktemp and usage of ${par:=word}.
Thomas Arendsen Hein <thomas@intevation.de>
parents: 769
diff changeset
    21
HGTMP=""
33a272b79e54 Replaced mktemp and usage of ${par:=word}.
Thomas Arendsen Hein <thomas@intevation.de>
parents: 769
diff changeset
    22
cleanup_exit() {
33a272b79e54 Replaced mktemp and usage of ${par:=word}.
Thomas Arendsen Hein <thomas@intevation.de>
parents: 769
diff changeset
    23
    rm -rf "$HGTMP"
33a272b79e54 Replaced mktemp and usage of ${par:=word}.
Thomas Arendsen Hein <thomas@intevation.de>
parents: 769
diff changeset
    24
}
33a272b79e54 Replaced mktemp and usage of ${par:=word}.
Thomas Arendsen Hein <thomas@intevation.de>
parents: 769
diff changeset
    25
754
3e73bf876f17 Fixes and cleanups to hgeditor:
Thomas Arendsen Hein <thomas@intevation.de>
parents: 684
diff changeset
    26
# Remove temporary files even if we get interrupted
831
232d0616a80a Cleaned up trap handling:
Thomas Arendsen Hein <thomas@intevation.de>
parents: 814
diff changeset
    27
trap "cleanup_exit" 0 # normal exit
11190
43337076ba92 Fixed a bashism with trap numbers in hgeditor.
Javi Merino <cibervicho@gmail.com>
parents: 4687
diff changeset
    28
trap "exit 255" HUP INT QUIT ABRT TERM
796
33a272b79e54 Replaced mktemp and usage of ${par:=word}.
Thomas Arendsen Hein <thomas@intevation.de>
parents: 769
diff changeset
    29
11266
2b440bb8a66b Fixed a bashism with the use of $RANDOM in hgeditor.
Javi Merino <cibervicho@gmail.com>
parents: 11190
diff changeset
    30
HGTMP=$(mktemp -d ${TMPDIR-/tmp}/hgeditor.XXXXXX)
2b440bb8a66b Fixed a bashism with the use of $RANDOM in hgeditor.
Javi Merino <cibervicho@gmail.com>
parents: 11190
diff changeset
    31
[ x$HGTMP != x -a -d $HGTMP ] || {
2b440bb8a66b Fixed a bashism with the use of $RANDOM in hgeditor.
Javi Merino <cibervicho@gmail.com>
parents: 11190
diff changeset
    32
  echo "Could not create temporary directory! Exiting." 1>&2
2b440bb8a66b Fixed a bashism with the use of $RANDOM in hgeditor.
Javi Merino <cibervicho@gmail.com>
parents: 11190
diff changeset
    33
  exit 1
796
33a272b79e54 Replaced mktemp and usage of ${par:=word}.
Thomas Arendsen Hein <thomas@intevation.de>
parents: 769
diff changeset
    34
}
33a272b79e54 Replaced mktemp and usage of ${par:=word}.
Thomas Arendsen Hein <thomas@intevation.de>
parents: 769
diff changeset
    35
754
3e73bf876f17 Fixes and cleanups to hgeditor:
Thomas Arendsen Hein <thomas@intevation.de>
parents: 684
diff changeset
    36
(
3e73bf876f17 Fixes and cleanups to hgeditor:
Thomas Arendsen Hein <thomas@intevation.de>
parents: 684
diff changeset
    37
    grep '^HG: changed' "$1" | cut -b 13- | while read changed; do
4687
b5bbfa18daf7 hgeditor: Use $HG to run 'hg diff' (see 849f011dbf79)
Thomas Arendsen Hein <thomas@intevation.de>
parents: 4659
diff changeset
    38
        "$HG" diff "$changed" >> "$HGTMP/diff"
754
3e73bf876f17 Fixes and cleanups to hgeditor:
Thomas Arendsen Hein <thomas@intevation.de>
parents: 684
diff changeset
    39
    done
3e73bf876f17 Fixes and cleanups to hgeditor:
Thomas Arendsen Hein <thomas@intevation.de>
parents: 684
diff changeset
    40
)
348
442eb02cf870 Improved hgeditor:
Thomas Arendsen Hein <thomas@intevation.de>
parents: 280
diff changeset
    41
1599
f93fde8f5027 remove the gpg stuff from hgeditor (superseded by the signing extension)
Benoit Boissinot <benoit.boissinot@ens-lyon.org>
parents: 1009
diff changeset
    42
cat "$1" > "$HGTMP/msg"
684
4ccf3de52989 Turn off signing with hgeditor by default
Matt Mackall <mpm@selenic.com>
parents: 683
diff changeset
    43
3025
d9b8d28c0b94 Find the system's MD5 binary.
Will Maier <willmaier@ml1.net>
parents: 1706
diff changeset
    44
MD5=$(which md5sum 2>/dev/null) || \
4659
7a7d4937272b Kill trailing spaces
Thomas Arendsen Hein <thomas@intevation.de>
parents: 3025
diff changeset
    45
    MD5=$(which md5 2>/dev/null)
3025
d9b8d28c0b94 Find the system's MD5 binary.
Will Maier <willmaier@ml1.net>
parents: 1706
diff changeset
    46
[ -x "${MD5}" ] && CHECKSUM=`${MD5} "$HGTMP/msg"`
1009
1bc619b12025 Don't show the diff in hgeditor if there are no changes in file contents.
Thomas Arendsen Hein <thomas@intevation.de>
parents: 839
diff changeset
    47
if [ -s "$HGTMP/diff" ]; then
1bc619b12025 Don't show the diff in hgeditor if there are no changes in file contents.
Thomas Arendsen Hein <thomas@intevation.de>
parents: 839
diff changeset
    48
    $EDITOR "$HGTMP/msg" "$HGTMP/diff" || exit $?
1bc619b12025 Don't show the diff in hgeditor if there are no changes in file contents.
Thomas Arendsen Hein <thomas@intevation.de>
parents: 839
diff changeset
    49
else
1bc619b12025 Don't show the diff in hgeditor if there are no changes in file contents.
Thomas Arendsen Hein <thomas@intevation.de>
parents: 839
diff changeset
    50
    $EDITOR "$HGTMP/msg" || exit $?
1bc619b12025 Don't show the diff in hgeditor if there are no changes in file contents.
Thomas Arendsen Hein <thomas@intevation.de>
parents: 839
diff changeset
    51
fi
3025
d9b8d28c0b94 Find the system's MD5 binary.
Will Maier <willmaier@ml1.net>
parents: 1706
diff changeset
    52
[ -x "${MD5}" ] && (echo "$CHECKSUM" | ${MD5} -c >/dev/null 2>&1 && exit 13)
754
3e73bf876f17 Fixes and cleanups to hgeditor:
Thomas Arendsen Hein <thomas@intevation.de>
parents: 684
diff changeset
    53
1599
f93fde8f5027 remove the gpg stuff from hgeditor (superseded by the signing extension)
Benoit Boissinot <benoit.boissinot@ens-lyon.org>
parents: 1009
diff changeset
    54
mv "$HGTMP/msg" "$1"
348
442eb02cf870 Improved hgeditor:
Thomas Arendsen Hein <thomas@intevation.de>
parents: 280
diff changeset
    55
831
232d0616a80a Cleaned up trap handling:
Thomas Arendsen Hein <thomas@intevation.de>
parents: 814
diff changeset
    56
exit $?